A comparison of effort estimation methods for 4GL programs: experiences with Statistics and Data Mining

José C. Riquelme, Macario Polo, Jesús S. Aguilar, Mario Piattini, Francisco J. Ferrer, Francisco Ruiz

Abstract

This paper presents an empirical study analysing the relationship between a set of metrics for Fourth-Generation Language (4GL) programs and their maintainability. The analysis uses historical data from several industrial projects and three different approaches: the first relates metrics to maintainability using descriptive statistics, and the other two are based on Data Mining techniques. The results obtained with the three techniques are discussed, and a set of equations and rules for predicting the maintenance effort of this kind of program is presented.
Finally, we evaluated the prediction accuracy of these methods on new, unseen data that were not used to build the knowledge model. The results were satisfactory: applied separately, each technique gives the manager a useful perspective, and the three together offer complementary insight into the data.

International Journal of Software Engineering and Knowledge Engineering, 16(1), 127-140.