Austin Z. Henley scite author profile

Empirical studies have revealed that software developers spend 35%-50% of their time navigating through source code during development activities, yet fundamental questions remain: Are these percentages too high, or simply inherent in the nature of software development? Are there factors that somehow determine a lower bound on how effectively developers can navigate a given information space? Answering questions like these requires a theory that captures the core of developers' navigation decisions. Therefore, we use the central proposition of Information Foraging Theory to investigate developers' ability to predict the value and cost of their navigation decisions. Our results showed that over 50% of developers' navigation choices produced less value than they had predicted and nearly 40% cost more than they had predicted. We used those results to guide a literature analysis, to investigate the extent to which these challenges are met by current research efforts, revealing a new area of inquiry with a rich and crosscutting set of research challenges and open problems. CCS Concepts • Software and its engineering➝Software notations and tools • Software and its engineering➝Software creation and management

show abstract

To fix or to learn? How production bias affects developers' information foraging during debugging

Piorkowski

Fleming

Scaffidi

et al. 2015

View full text Add to dashboard Cite

A fine-grained data set and analysis of tangling in bug fixing commits

et al. 2022

View full text Add to dashboard Cite

Context Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Austin Z. Henley

What's Wrong with Computational Notebooks? Pain Points, Needs, and Design Opportunities

Foraging and navigations, fundamentally: developers' predictions of value and cost

To fix or to learn? How production bias affects developers' information foraging during debugging

A fine-grained data set and analysis of tangling in bug fixing commits

Contact Info

Product

Resources

About