Software clustering is usually used for program understanding. Since the software clustering is a NP-complete problem, a number of Genetic Algorithms (GAs) are proposed for solving this problem. In literature, there are two wellknown GAs for software clustering, namely, Bunch and DAGC, that use the genetic operators such as crossover and mutation to better search the solution space and generating better solutions during genetic algorithm evolutionary process. The major drawbacks of these operators are (1) the difficulty of defining operators, (2) the difficulty of determining the probability rate of these operators, and (3) do not guarantee to maintain building blocks. Estimation of Distribution (EDA) based approaches, by removing crossover and mutation operators and maintaining building blocks, can be used to solve the problems of genetic algorithms. This approach creates the probabilistic models from individuals to generate new population during evolutionary process, aiming to achieve more success in solving the problems. The aim of this paper is to recast EDA for software clustering problems, which can overcome the existing genetic operators’ limitations. For achieving this aim, we propose a new distribution probability function and a new EDA based algorithm for software clustering. To the best knowledge of the authors, EDA has not been investigated to solve the software clustering problem. The proposed EDA has been compared with two well-known genetic algorithms on twelve benchmarks. Experimental results show that the proposed approach provides more accurate results, improves the speed of convergence and provides better stability when compared against existing genetic algorithms such as Bunch and DAGC.
Software clustering is usually used for program comprehension. Since it is considered to be the most crucial NP-complete problem, several genetic algorithms have been proposed to solve this problem. In the literature, there exist some objective functions (i.e., fitness functions) which are used by genetic algorithms for clustering. These objective functions determine the quality of each clustering obtained in the evolutionary process of the genetic algorithm in terms of cohesion and coupling. The major drawbacks of these objective functions are the inability to (1) consider utility artifacts, and (2) to apply to another software graph such as artifact feature dependency graph. To overcome the existing objective functions' limitations, this paper presents a new objective function. The new objective function is based on information theory, aiming to produce a clustering in which information loss is minimized. For applying the new proposed objective function, we have developed a genetic algorithm aiming to maximize the proposed objective function. The proposed genetic algorithm, named ILOF, has been compared to that of some other well-known genetic algorithms. The results obtained confirm the high performance of the proposed algorithm in solving nine software systems. The performance achieved is quite satisfactory and promising for the tested benchmarks.
Lack of up-to-date software documentation hinders the software evolution and maintenance processes, as simply the software structure and code could be easily misunderstood and not comprehended. One approach to overcome such problems is to extract and reconstruct the software structure from the available source code so that it can be assessed against the required changes. Such approach is known as Software clustering. Bunch is a well-known tool for structure reconstructing and maintaining of software systems. As the maintenance of software costs a lot, Bunch is usually employed to reconstruct the structure of software and consequently making required changes. The aim of this paper is to investigate the strengths and weaknesses of the Bunch; if problems and proposals offered in this paper are studied and solved, this tool will be turned to be a useful tool for structure recovering and maintenance of a software system.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.