Abstract. Refactoring source code has many benets (e.g. improving maintainability, robustness and source code quality), but it takes time away from other implementation tasks, resulting in developers neglecting refactoring steps during the development process. But what happens when they know that the quality of their source code needs to be improved and they can get the extra time and money to refactor the code? What will they do? What will they consider the most important for improving source code quality? What sort of issues will they address rst or last and how will they solve them? In our paper, we look for answers to these questions in a case study of refactoring large-scale industrial systems where developers participated in a project to improve the quality of their software systems. We collected empirical data of over a thousand refactoring patches for 5 systems with over 5 million lines of code in total, and we found that developers really optimized the refactoring process to signicantly improve the quality of these systems.
a b s t r a c tContext: Program queries play an important role in several software evolution tasks like program comprehension, impact analysis, or the automated identification of anti-patterns for complex refactoring operations. A central artifact of these tasks is the reverse engineered program model built up from the source code (usually an Abstract Semantic Graph, ASG), which is traditionally post-processed by dedicated, hand-coded queries. Objective: Our paper investigates the costs and benefits of using the popular industrial Eclipse Modeling Framework (EMF) as an underlying representation of program models processed by four different general-purpose model query techniques based on native Java code, OCL evaluation and (incremental) graph pattern matching. Method: We provide in-depth comparison of these techniques on the source code of 28 Java projects using anti-pattern queries taken from refactoring operations in different usage profiles. Results: Our results show that general purpose model queries can outperform hand-coded queries by 2-3 orders of magnitude, with the trade-off of an increased in memory consumption and model load time of up to an order of magnitude. Conclusion: The measurement results of usage profiles can be used as guidelines for selecting the appropriate query technologies in concrete scenarios.
The quality of a software system is mostly defined by its source code. Software evolves continuously, it gets modified, enhanced, and new requirements always arise. If we do not spend time periodically on improving our source code, it becomes messy and its quality will decrease inevitably. Literature tells us that we can improve the quality of our software product by regularly refactoring it. But does refactoring really increase software quality? Can it happen that a refactoring decreases the quality? Is it possible to recognize the change in quality caused by a single refactoring operation? In our paper, we seek answers to these questions in a case study of refactoring large-scale proprietary software systems. We analyzed the source code of 5 systems, and measured the quality of several revisions for a period of time. We analyzed 2 million lines of code and identified nearly 200 refactoring commits which fixed over 500 coding issues. We found that one single refactoring only makes a small change (sometimes even decreases quality), but when we do them in blocks, we can significantly increase quality, which can result not only in the local, but also in the global improvement of the code.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.