One of the main challenges faced by today's developers is keeping up with the staggering amount of source code that needs to be read and understood. In order to help developers with this problem and reduce the costs associated with it, one solution is to use simple textual descriptions of source code entities that developers can grasp easily, while capturing the code semantics precisely. We propose an approach to automatically determine such descriptions, based on automated text summarization technology.
Although version control systems allow developers to describe and explain the rationale behind code changes in commit messages, the state of practice indicates that most of the time such commit messages are either very short or even empty. In fact, in a recent study of 23K+ Java projects it has been found that only 10% of the messages are descriptive and over 66% of those messages contained fewer words as compared to a typical English sentence (i.e., 15-20 words). However, accurate and complete commit messages summarizing software changes are important to support a number of development and maintenance tasks. In this paper we present an approach, coined as ChangeScribe, which is designed to generate commit messages automatically from change sets. ChangeScribe generates natural language commit messages by taking into account commit stereotype, the type of changes (e.g., files rename, changes done only to property files), as well as the impact set of the underlying changes. We evaluated ChangeScribe in a survey involving 23 developers in which the participants analyzed automatically generated commit messages from real changes and compared them with commit messages written by the original developers of six open source systems. The results demonstrate that automatically generated messages by ChangeScribe are preferred in about 62% of the cases for large commits, and about 54% for small commits.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.