Background: Bug-fixing is the crux of software maintenance. It entails tending to heaps of bug reports using limited resources. Using historical data, we can ask questions that contribute to better-informed allocation heuristics. The caveat here is that often there is not enough data to provide a sound response. This issue is especially prominent for young projects. Also, answers may vary from project to project. Consequently, it is impossible to generalize results without assuming a notion of relatedness between projects. Aims: Evaluate the independent impact of three report features on the bug-fixing time (BFT), generalizing results from many projects: bug priority, code-churn size in bug-fixing commits, and existence of links to other reports (e.g., depends on or blocks other bug reports). Method: We analyze 55 projects from the Apache ecosystem using Bayesian statistics. Similar to standard random effects methodology, we assume each project's average BFT is a dispersed version of a global average BFT that we want to assess. We split the data based on feature values/range (e.g., with or without links). For each split, we compute a posterior distribution over its respective global BFT. Finally, we compare the posteriors to establish the feature's effect on the BFT. We run independent analyses for each feature. Results: Our results show that the existence of links and higher code-churn values lead to BFTs that are at least twice as long. On the other hand, considering three levels of priority (low, medium, and high), we observe no difference in the BFT. Conclusion: To the best of our knowledge, this is the first study using hierarchical Bayes to extrapolate results from multiple projects and assess the global effect of different attributes on the BFT. We use this methodology to gain insight into how links, priority, and code-churn size impact the BFT. On top of that, our posteriors can be used as priors to analyze novel projects, potentially young and scarce on data.
We also believe our methodology can be reused for other generalization studies in empirical software engineering.
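The Method steps (split the data by feature, compute a posterior over each split's global BFT, compare the posteriors) can be illustrated with a minimal sketch. It is not the study's actual model: it assumes a simple normal random-effects model on log-scale per-project BFT summaries, flat priors evaluated on a grid, and made-up toy numbers. The names `global_mean_posterior`, `prob_greater`, and all data values are hypothetical.

```python
import math


def global_mean_posterior(y, s, mu_grid, tau_grid):
    """Grid posterior over the global mean mu (flat priors on mu and tau).

    Assumed model (illustration only): y_j ~ Normal(mu, s_j^2 + tau^2),
    where y_j is project j's average log-BFT, s_j its standard error,
    and tau the between-project dispersion, which is summed out.
    """
    weights = []
    for mu in mu_grid:
        total = 0.0
        for tau in tau_grid:
            log_lik = 0.0
            for yj, sj in zip(y, s):
                var = sj * sj + tau * tau
                log_lik += -0.5 * math.log(2 * math.pi * var)
                log_lik -= (yj - mu) ** 2 / (2 * var)
            total += math.exp(log_lik)
        weights.append(total)
    norm = sum(weights)
    return [w / norm for w in weights]


def prob_greater(mu_grid, post_a, post_b):
    """P(mu_a > mu_b) for two independent posteriors on the same grid."""
    p = 0.0
    for i, pa in enumerate(post_a):
        for k, pb in enumerate(post_b):
            if mu_grid[i] > mu_grid[k]:
                p += pa * pb
    return p


# Shared grids: mu in log-days, tau strictly positive.
mu_grid = [1.0 + 0.02 * i for i in range(151)]
tau_grid = [0.05 * i for i in range(1, 41)]

# Toy per-project summaries (invented for illustration, not from the study).
y_with, s_with = [3.1, 2.8, 3.4, 3.0], [0.2, 0.3, 0.25, 0.2]
y_without, s_without = [2.2, 2.0, 2.5, 2.1], [0.2, 0.25, 0.3, 0.2]

post_with = global_mean_posterior(y_with, s_with, mu_grid, tau_grid)
post_without = global_mean_posterior(y_without, s_without, mu_grid, tau_grid)

# Posterior probability that reports with links have the longer global BFT.
p_longer = prob_greater(mu_grid, post_with, post_without)
print(f"P(global BFT with links > without links) = {p_longer:.3f}")
```

Under the same assumptions, a posterior such as `post_with` could be reused as the prior for the global mean when analyzing a new project with little data.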