Empirical studies have revealed that software developers spend 35%-50% of their time navigating through source code during development activities, yet fundamental questions remain: Are these percentages too high, or simply inherent in the nature of software development? Are there factors that somehow determine a lower bound on how effectively developers can navigate a given information space? Answering questions like these requires a theory that captures the core of developers' navigation decisions. Therefore, we use the central proposition of Information Foraging Theory to investigate developers' ability to predict the value and cost of their navigation decisions. Our results showed that over 50% of developers' navigation choices produced less value than they had predicted and nearly 40% cost more than they had predicted. We used those results to guide a literature analysis, to investigate the extent to which these challenges are met by current research efforts, revealing a new area of inquiry with a rich and crosscutting set of research challenges and open problems.
CCS Concepts • Software and its engineering➝Software notations and tools • Software and its engineering➝Software creation and management
Context
Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs.
Objective
We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits.
Methods
We use a crowdsourcing approach to manually label every line in bug fixing commits, validating which changes contribute to the bug fix. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus.
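The consensus rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the study: the function name `consensus_label`, the parameter `min_agreement`, and the label strings are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agreement: int = 3) -> Optional[str]:
    """Return the label chosen by at least `min_agreement` participants, else None.

    With four labels per line and a threshold of three, at most one label
    can reach the threshold, so the result is unambiguous.
    """
    if not labels:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical label names, chosen for illustration only.
line_labels = {
    "line 1": ["bugfix", "bugfix", "bugfix", "refactoring"],
    "line 2": ["bugfix", "bugfix", "test", "test"],
}

for line, labels in line_labels.items():
    print(line, consensus_label(labels))
```

Lines for which the function returns `None` have no consensus and would need separate handling, for example by treating them as uncertain data.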
Results
We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files, this ratio increases to between 66% and 87%. We find that about 11% of lines are hard to label, leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case.
Conclusion
Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.
Programmers spend considerable time navigating source code, and we recently proposed the Patchworks code editor to help address this problem. A prior preliminary study of Patchworks found that it significantly reduced programmer navigation time and navigation errors. In this paper, we expand on these findings by investigating the effect of various patch-arranging strategies in Patchworks. To evaluate these strategies, we ran a simulation study based on actual programmer navigation data. Our simulator results showed (1) that none of the strategies tested had a significant effect on programmer-navigation time, and (2) that navigating code using Patchworks, regardless of strategy, was significantly faster than using Eclipse.
Design principles are a key tool for creators of interactive systems; however, a cohesive set of principles has yet to emerge for the design of code editors. In this paper, we conducted a between-subjects empirical study comparing the navigation behaviors of 32 professional LabVIEW programmers using two different code-editor interfaces: the ubiquitous tabbed editor and the experimental Patchworks editor. Our analysis focused on how the programmers arranged and navigated among open information patches (i.e., code modules and program output). Key findings of our study included that Patchworks users made significantly fewer click actions per navigation, juxtaposed patches side by side significantly more, and exhibited significantly fewer navigation mistakes than tabbed-editor users. Based on these findings and more, we propose five general principles for the design of effective navigation affordances in code editors.