Technical debt expresses the use of non-optimal solutions for short-term gains. Self-admitted technical debt is a technical debt that is deliberately introduced and recognized by developers in source code comments, including design debt, requirement debt, and defect debt, etc. Currently, many methods are proposed to detect SATD. However, these methods are limited to the identification of SATD or non-SATD. In this paper, we propose a CNN-BiLSTM method to detect and classify SATD. Through our cross-project experiments on 10 projects, it is shown that our method can not only effectively detect SATD, but also classify design debt, requirement debt, and defect debt in SATD.
Effectively identifying self-admitted technical debt (SATD) from project source code comments helps developers quickly find and repay these debts, thereby reducing its negative impact. Previous studies used techniques based on patterns, text mining, natural language processing, and neural networks to detect SATD. Compared with these above, Convolutional Neural Networks (CNN) have the strong feature extraction ability. Deep network ensembles are demonstrated great potential for the task of sentences classification. In order to boost the performance of CNN-based SATD detecting, we propose a deep neural network ensemble contribute to ensemble learning in a simple yet effective way. Specifically, CNN, CNN-LSTM (convolutional neural network and long short-term memory), and DPCNN (Deep Pyramid Convolutional Neural Networks) are used as individual classifiers to diversify the deep network ensembles. In order to improve the explainability, we introduce attention to measure the contribution of feature words to SATD classification. 62,285 source code comments from 10 projects were used in our experiments. The results show that our approach can effectively reduce misjudgment and detect more SATD, especially for cross-project, so as to greatly improve the detection accuracy.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.