Abstract: Being able to identify software discussions that are primarily about design, which we call design mining, can improve documentation and maintenance of software systems. Existing design mining approaches achieve good classification performance using natural language processing (NLP) techniques, but their conclusion stability is generally poor: a classifier trained on a given dataset of software projects has so far not worked well on different artifacts or different datasets. In this study, we replicate an existing state-of-the-art design mining study to show how conclusion stability suffers across different artifact types and projects.
“…Previous studies have introduced various vectorization techniques. In response to our previous study [15], we demonstrate how word embedding as a vectorization choice can improve the performance of the classifier. However, word embedding needs a reference model.…”
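The snippet above notes that word embedding as a vectorization choice requires a reference model. A minimal sketch of the idea follows; the three-dimensional toy vectors are hypothetical stand-ins for a real pretrained reference model (the paper's actual embeddings are not reproduced here), but the mechanics of averaging word vectors into a document vector are the same.

```python
# Sketch of word-embedding vectorization for discussion text.
# The 3-dimensional toy vectors below are hypothetical; a real
# pipeline would load a reference model pretrained on a large corpus.
import math

TOY_EMBEDDINGS = {
    "refactor":  [0.90, 0.10, 0.00],
    "interface": [0.80, 0.20, 0.10],
    "design":    [0.85, 0.15, 0.05],
    "crash":     [0.10, 0.90, 0.20],
    "nullpointerexception": [0.05, 0.95, 0.30],
}

def embed(text, embeddings=TOY_EMBEDDINGS, dim=3):
    """Average the vectors of known words; out-of-vocabulary words are skipped."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

design_vec = embed("should we refactor this interface design")
bug_vec = embed("crash with NullPointerException")
```

With such a representation, a design-related comment lands close (in cosine distance) to other design vocabulary and far from defect-report vocabulary, which is what makes embeddings a useful vectorization choice for the downstream classifier.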
Section: How Useful Are Software-specific Word Vectorizers? (mentioning)
confidence: 68%
“…Early results from Robbes and Janes [23] reported on using ULMFiT [24] for sentiment analysis with some success. We also use the transfer NLP potential of ULMFiT, which we discuss in [15]. Robbes and Janes emphasized the importance of pretraining the learner on (potentially small) task-specific datasets.…”
Section: Cross-project Classifiers In Software Engineering (mentioning)
Developer discussions range from in-person hallway chats to comment chains on bug reports. Being able to identify discussions that touch on software design would be helpful in documenting and refactoring software. Design mining is the application of machine learning techniques to correctly label a given discussion artifact, such as a pull request, as pertaining (or not) to design. In this paper we demonstrate a simple example of how design mining works. We then show how conclusion stability is poor across different artifact types and different projects. We present two techniques, augmentation and context specificity, that greatly improve the conclusion stability and cross-project relevance of design mining. Our new approach achieves an AUC of 0.88 on the within-dataset classification task and 0.80 on the cross-dataset classification task.
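The abstract's pipeline, labeling discussion artifacts as design or non-design and evaluating with AUC, can be illustrated with a toy sketch. The keyword scorer and the labeled examples below are hypothetical stand-ins for the paper's trained NLP classifiers and real project data; only the AUC computation (probability that a random positive outranks a random negative) matches the metric reported above.

```python
# Toy illustration of design mining plus the AUC-ROC metric.
# DESIGN_CUES and the dataset are assumptions for illustration only;
# the actual study uses trained NLP classifiers, not keyword matching.
DESIGN_CUES = {"design", "architecture", "refactor", "coupling", "interface"}

def design_score(comment):
    """Fraction of words that are design cues: a crude classifier score."""
    words = comment.lower().split()
    return sum(w in DESIGN_CUES for w in words) / len(words)

def auc(scored):
    """AUC-ROC: probability a random positive scores above a random negative,
    counting ties as half a win (the rank-statistic formulation)."""
    pos = [s for s, label in scored if label]
    neg = [s for s, label in scored if not label]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# (comment, is_design_discussion) pairs; labels assumed for illustration
dataset = [
    ("we should refactor this interface to reduce coupling", True),
    ("the new architecture splits storage from rendering", True),
    ("build fails on CI after the latest merge", False),
    ("typo in the user guide, fixing now", False),
]
scored = [(design_score(text), label) for text, label in dataset]
```

On this four-example toy set the cue-based scorer separates the classes perfectly (AUC of 1.0); the paper's 0.88 within-dataset and 0.80 cross-dataset figures reflect the much harder setting of real, noisy discussion data.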
“…As manual classification is not a practical option to classify 1,661,922 discussions, we use machine learning techniques. We followed the protocol of Brunet et al [19] with some improvisations suggested by Mahadi et al [45].…”
Section: Building the Discussion Classifier (mentioning)
confidence: 99%
“…Viviani et al [75] applied a classifier to automatically locate paragraphs in pull request discussions related to design. Mahadi et al [45] trained a classifier on the dataset created by Brunet et al [19] and tested it on the dataset of Viviani et al [75]. However, both of these datasets include discussions only from pull requests.…”
Section: Related Work (mentioning)
confidence: 99%
“…For example, Brunet et al [19] and Viviani et al [75,76] examined how design discussions are embedded in pull request comments and how it can be difficult for developers to piece together these discussions. Researchers have also developed techniques for detecting design discussions from a (single) communication channel using Machine Learning (ML) techniques [19,45,75,76].…”
Developer discussions range from in-person hallway chats to comment chains on bug reports. Being able to identify discussions that touch on software design would be helpful in documenting and refactoring software. Design mining is the application of machine learning techniques to correctly label a given discussion artifact, such as a pull request, as pertaining (or not) to design. In this work we demonstrate a simple example of how design mining works. We first replicate an existing state-of-the-art design mining study to show how conclusion stability is poor across different artifact types and different projects. Then we introduce two techniques, augmentation and context specificity, that greatly improve the conclusion stability and cross-project relevance of design mining. Our new approach achieves an AUC-ROC of 0.88 on the within-dataset classification task and 0.84 on the cross-dataset classification task.