This paper analyzes the correlations between problem domain measures, such as the number of distinct nouns and distinct verbs in the requirements artifacts, and solution domain measures, such as the number of software classes and methods in the corresponding object-oriented software. For this purpose, 14 completed software development projects of a CMMI Level 3 certified defense industry company were analyzed. The observed strong correlation is taken as an indication of a linear relationship between the measures, and a size estimation model based on linear regression analysis is proposed. The prediction performance of the method is analyzed on the 14 software projects. Moreover, a high correlation was observed between the problem domain measures and the development effort. Therefore, linear regression analysis is also used to estimate effort in terms of the problem domain measures. The effort estimates are evaluated and compared with the efforts predicted using the size measured by the COSMIC Function Point (CFP) method. The results show that the proposed method provides more accurate effort estimates than those obtained using CFP size measurement.
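The regression step described above can be sketched as follows. This is a minimal illustration, assuming a single problem domain measure (distinct nouns) as predictor of a solution domain measure (software classes); the numbers are invented for demonstration and are not the paper's project data.

```python
# Sketch: linear regression from a problem domain measure to a solution
# domain measure, as in the proposed size estimation model.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-project data: distinct nouns in requirements vs.
# classes in the resulting object-oriented software.
nouns = np.array([[40], [55], [72], [90], [110], [130]])
classes = np.array([38, 51, 70, 85, 104, 122])

model = LinearRegression().fit(nouns, classes)
r2 = model.score(nouns, classes)      # strength of the linear relationship
predicted = model.predict([[100]])    # size estimate for a new project
```

A high coefficient of determination (R²) on such data is what would justify treating the relationship as linear, as the abstract reports.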
Predicting how much effort a software project will require as early as possible is an important issue in the software industry. Software size is one of the most commonly used attributes in effort estimation. In this paper, we propose an early software size and effort estimation method based on a conceptual model of the problem domain. Our method utilizes the noteworthy domain concepts identified mainly from the use cases written in the requirements phase of the software development lifecycle. To develop the model and evaluate its prediction quality, the use cases written and the effort data collected for 14 industrial software development projects of a CMMI Level 3 certified defense industry company were used. Evaluation results reveal a high correlation between the number of conceptual classes (i.e., domain objects) identified during requirements analysis, the number of classes constituting the resulting software, and the actual effort spent. Moreover, we used the use case point (UCP) method to estimate the effort needed for each project and compared the results of the UCP analysis with those obtained with our method. The comparisons show that, for the projects considered, our method gives better effort estimates than the UCP method.
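For context, the UCP baseline the abstract compares against can be sketched as below. The use case and actor weights follow Karner's commonly cited formulation; the counts, adjustment factors, and the productivity factor of 20 person-hours per UCP are illustrative assumptions, not values from the paper.

```python
# Sketch of a use case point (UCP) effort estimate (Karner-style weights).
def ucp_effort(simple_uc, average_uc, complex_uc,
               simple_actors, average_actors, complex_actors,
               tcf=1.0, ecf=1.0, hours_per_ucp=20):
    # Unadjusted use case weight: 5 / 10 / 15 per complexity level.
    uucw = 5 * simple_uc + 10 * average_uc + 15 * complex_uc
    # Unadjusted actor weight: 1 / 2 / 3 per complexity level.
    uaw = 1 * simple_actors + 2 * average_actors + 3 * complex_actors
    # Adjust by technical (TCF) and environmental (ECF) factors.
    ucp = (uucw + uaw) * tcf * ecf
    return ucp * hours_per_ucp  # estimated effort in person-hours

effort = ucp_effort(4, 6, 2, 2, 1, 1)
```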
Predicting how much effort will be required to complete a software project as early as possible is a very important factor in the success of software development projects. There are several size measures and corresponding measurement methods, including function points and their variants, that can be used for effort estimation. However, in most projects only a limited amount of information is available in the early stages, and significant effort is spent on size measurement and effort estimation with such methods. This paper analyzes the correlation between size metrics of the conceptual model of the problem domain and those of the resulting software. For this purpose, we consider open-source project management and game software. We apply linear regression and cross-validation techniques to investigate the relation between the sizes of the problem domain (i.e., conceptual) and solution domain (i.e., design) models. The results reveal a high correlation between the number of conceptual classes in the problem domain model and the number of software classes constituting the corresponding software. They suggest that problem domain descriptions can be used in the early stages of software development projects to make plausible predictions of software size.
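The combination of linear regression and cross-validation described above can be sketched as follows. The data is invented for illustration (conceptual classes vs. software classes per project); the analyzed open-source projects' real measurements are not reproduced here.

```python
# Sketch: leave-one-out cross-validation of a linear size model relating
# problem domain (conceptual) size to solution domain (design) size.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

conceptual = np.array([[12], [18], [25], [31], [40], [47], [55], [63]])
software = np.array([30, 44, 61, 77, 98, 115, 133, 152])

# Each project is held out once; the model is fit on the rest and its
# absolute prediction error on the held-out project is recorded.
scores = cross_val_score(LinearRegression(), conceptual, software,
                         cv=LeaveOneOut(),
                         scoring="neg_mean_absolute_error")
mae = -scores.mean()  # average held-out prediction error (in classes)
```

A small mean absolute error on held-out projects is the evidence that the linear relationship generalizes rather than merely fitting the sample.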
Cohesion is one of the important factors used to evaluate software maintainability. However, measuring cohesion is relatively difficult when tracing source code manually. Although there are many static code analysis tools, not every tool measures every metric, so users must apply different tools for different metrics. In this study, as an alternative to such tools, we predicted the cohesion values (LCOM2, TCC, LCC, and LSCC) with machine learning techniques (KNN, REPTree, multi-layer perceptron, linear regression (LR), support vector machine, and random forest (RF)). We created two datasets from two different open-source software projects. According to the obtained results, the KNN algorithm provided the best results for the LCOM2 and TCC metrics, and the REPTree algorithm for the LCC and LSCC metrics. However, across all the metrics, RF, REPTree, and KNN performed close to each other, so any of these three techniques can be used for software cohesion metric prediction.
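The prediction setup the abstract describes can be sketched as below with two of the named learners (KNN and random forest). The features (methods, attributes, lines of code per class) and target values are illustrative assumptions, not the study's datasets, and REPTree here would correspond to a pruned decision tree regressor.

```python
# Sketch: predicting a cohesion metric from other class-level measures,
# in the spirit of the study's KNN / RF experiments.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical features per class: [methods, attributes, lines of code].
X = np.array([[5, 3, 120], [8, 4, 200], [12, 6, 340],
              [3, 2, 80], [15, 7, 420], [9, 5, 260]])
cohesion = np.array([0.2, 0.35, 0.5, 0.1, 0.6, 0.4])  # illustrative targets

knn = KNeighborsRegressor(n_neighbors=3).fit(X, cohesion)
rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, cohesion)

# Predicted cohesion for an unseen class.
knn_pred = knn.predict([[10, 5, 280]])
rf_pred = rf.predict([[10, 5, 280]])
```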