In order to increase our ability to use measurement to support software development practise we need to do more analysis of code. However, empirical studies of code are expensive and their results are difficult to compare. We describe the Qualitas Corpus, a large curated collection of open source Java systems. The corpus reduces the cost of performing large empirical studies of code and supports comparison of measurements of the same artifacts. We discuss its design, organisation, and issues associated with its development.
Large amounts of Java software have been written since the language's escape into unsuspecting software ecology more than ten years ago. Surprisingly little is known about the structure of Java programs in the wild: about the way methods are grouped into classes and then into packages, the way packages relate to each other, or the way inheritance and composition are used to put these programs together. We present the results of the first in-depth study of the structure of Java programs. We have collected a number of Java programs and measured their key structural attributes. We have found evidence that some relationships follow power-laws, while others do not. We have also observed variations that seem related to some characteristic of the application itself. This study provides important information for researchers who can investigate how and why the structural relationships we find may have originated, what they portend, and how they can be managed.
Advocates of the design principle avoid cyclic dependencies among modules have argued that cycles are detrimental to software quality attributes such as understandability, testability, reusability, buildability and maintainability, yet folklore suggests such cycles are common in real object-oriented systems. In this paper we present the first significant empirical study of cycles among the classes of 78 open-and closed-source Java applications. We find that, of the applications comprising enough classes to support such a cycle, about 45% have a cycle involving at least 100 classes and around 10% have a cycle involving at least 1,000 classes. We present further empirical evidence to support the contention these cycles are not due to intrinsic interdependencies between particular classes in a domain. Finally, we attempt to gauge the strength of connection among the classes in a cycle using the concept of a minimum edge feedback set.
Over the years many guidelines have been offered as to how to achieve good quality designs. We would like to be able to determine to what degree these guidelines actually help. To do that, we need to be able to determine when the guidelines have been followed. This is often difficult as the guidelines are often presented as heuristics or otherwise not completely specified. Nevertheless, we believe it is important to gather quantitative data on the effectiveness of design guidelines wherever possible.In this paper, we examine the use of "Dependency Injection", which is a design principle that is claimed to increase software design quality attributes such as extensibility, modifiability, testability, and reusability. We develop operational definitions for it and analysis techniques for detecting its use. We demonstrate these techniques by applying them to 34 open source Java applications.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.