Low-pressure flashing mechanisms in iso-octane liquid jets

Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all the comments have the same goal and target audience. In this paper, we investigate how six diverse Java OSS projects use code comments, with the aim of understanding their purpose. Through our analysis, we produce a taxonomy of source code comments; subsequently, we investigate how often each category occur by manually classifying more than 2,000 code comments from the aforementioned projects. In addition, we conduct an initial evaluation on how to automatically classify code comments at line level into our taxonomy using machine learning; initial results are promising and suggest that an accurate classification is within reach.

show abstract

Fine-grained just-in-time defect prediction

Pascarella

Palomba

Bacchelli

2019

Journal of Systems and Software

102

View full text Add to dashboard Cite

Defect prediction models focus on identifying defect-prone code elements, for example to allow practitioners to allocate testing resources on specific subsystems and to provide assistance during code reviews. While the research community has been highly active in proposing metrics and methods to predict defects on long-term periods (i.e., at release time), a recent trend is represented by the so-called short-term defect prediction (i.e., at commit-level). Indeed, this strategy represents an effective alternative in terms of effort required to inspect files likely affected by defects. Nevertheless, the granularity considered by such models might be still too coarse. Indeed, existing commit-level models highlight an entire commit as defective even in cases where only specific files actually contain defects. In this paper, we first investigate to what extent commits are partially defective; then, we propose a novel fine-grained just-in-time defect prediction model to predict the specific files, contained in a commit, that are defective. Finally, we evaluate our model in terms of (i) performance and (ii) the extent to which it decreases the effort required to diagnose a defect. Our study highlights that: (1) defective commits are frequently composed of a mixture of defective and nondefective files, (2) our fine-grained model can accurately predict defective files with an AUC-ROC up to 82% and (3) our model would allow practitioners to save inspection efforts with respect to standard just-in-time techniques.

show abstract

Towards Automating Code Review Activities

Tufano

Pascarella

Tufano

et al. 2021

View full text Add to dashboard Cite

Code reviews are popular in both industrial and open source projects. The benefits of code reviews are widely recognized and include better code quality and lower likelihood of introducing bugs. However, since code review is a manual activity it comes at the cost of spending developers time on reviewing their teammates code.Our goal is to make the first step towards partially automating the code review process, thus, possibly reducing the manual costs associated with it. We focus on both the contributor and

show abstract

Information Needs in Contemporary Code Review

Pascarella

Spadini

Palomba

et al. 2018

Proc. ACM Hum.-Comput. Interact.

View full text Add to dashboard Cite

Contemporary code review is a widespread practice used by software engineers to maintain high software quality and share project knowledge. However, conducting proper code review takes time and developers often have limited time for review. In this paper, we aim at investigating the information that reviewers need to conduct a proper code review, to better understand this process and how research and tool support can make developers become more effective and efficient reviewers. Previous work has provided evidence that a successful code review process is one in which reviewers and authors actively participate and collaborate. In these cases, the threads of discussions that are saved by code review tools are a precious source of information that can be later exploited for research and practice. In this paper, we focus on this source of information as a way to gather reliable data on the aforementioned reviewers' needs. We manually analyze 900 code review comments from three large open-source projects and organize them in categories by means of a card sort. Our results highlight the presence of seven high-level information needs, such as knowing the uses of methods and variables declared/modified in the code under review. Based on these results we suggest ways in which future code review tools can better support collaboration and the reviewing task. Preprint [

show abstract

An Empirical Study on the Usage of Transformer Models for Code Completion

Ciniselli¹,

Cooper²,

Pascarella³

et al. 2021

IIEEE Trans. Software Eng.

View full text Add to dashboard Cite

How is video game development different from software development in open source?

Pascarella

Palomba

Penta

et al. 2018

View full text Add to dashboard Cite

Recent research has provided evidence that, in the industrial context, developing video games diverges from developing software systems in other domains, such as office suites and system utilities. In this paper, we consider video game development in the open source system (OSS) context. Specifically, we investigate how developers contribute to video games vs. non-games by working on different kinds of artifacts, how they handle malfunctions, and how they perceive the development process of their projects. To this purpose, we conducted a mixed, qualitative and quantitative study on a broad suite of 60 OSS projects. Our results confirm the existence of significant differences between game and non-game development, in terms of how project resources are organized and in the diversity of developers' specializations. Moreover, game developers responding to our survey perceive more difficulties than other developers when reusing code as well as performing automated testing, and they lack a clear overview of their system's requirements.

show abstract

A graph-based dataset of commit history of real-world Android apps

Geiger

Malavolta

Pascarella

et al. 2018

View full text Add to dashboard Cite

Obtaining a good dataset to conduct empirical studies on the engineering of Android apps is an open challenge. To start tackling this challenge, we present AndroidTimeMachine, the first, self-contained, publicly available dataset weaving spread-out data sources about real-world, open-source Android apps. Encoded as a graph-based database, AndroidTimeMachine concerns 8,431 real open-source Android apps and contains: (i) metadata about the apps' GitHub projects, (ii) Git repositories with full commit history and (iii) metadata extracted from the Google Play store, such as app ratings and permissions.

show abstract

Classifying code comments in Java software systems

2019

View full text Add to dashboard Cite

Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all the comments have the same goal and target audience. In this paper, we investigate how 14 diverse Java open and closed source software projects use code comments, with the aim of understanding their purpose. Through our analysis, we produce a taxonomy of source code comments; subsequently, we investigate how often each category occur by manually classifying more than 40,000 lines of code comments from the aforementioned projects. In addition, we investigate how to automatically classify code comments at line level into our taxonomy using machine learning; initial results are promising and suggest that an accurate classification is within reach, even when training the machine learner on projects different than the target one. Data and Materials [

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Luca Pascarella

Classifying Code Comments in Java Open-Source Software Systems

Fine-grained just-in-time defect prediction

Towards Automating Code Review Activities

Information Needs in Contemporary Code Review

An Empirical Study on the Usage of Transformer Models for Code Completion

How is video game development different from software development in open source?

A graph-based dataset of commit history of real-world Android apps

Classifying code comments in Java software systems

Contact Info

Product

Resources

About