Background.
Self-admitted technical debt (SATD) is a special kind of technical debt that is intentionally introduced and explicitly marked by code comments. Such debt reduces software quality and increases the cost of subsequent maintenance. Therefore, it is necessary to find and resolve these debts in time. Recently, many automatic approaches have been proposed to identify SATD.
Problem.
Popular IDEs support a number of predefined task annotation tags for indicating SATD in comments, and these tags are used in many projects. However, this clear prior knowledge is neglected by existing approaches when identifying SATD.
Objective.
We aim to investigate how far we have really progressed in the field of SATD identification by comparing existing approaches with a simple approach that leverages the predefined task tags to identify SATD.
Method.
We first propose a simple heuristic approach that fuzzily Matches task Annotation Tags (MAT) in comments to identify SATD. By nature, MAT is an unsupervised approach: it does not need any data to train a prediction model and is easy to understand. Then, we examine the real progress in SATD identification by comparing MAT against existing approaches.
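The tag-matching idea can be illustrated with a minimal sketch. The tag set (TODO, FIXME, XXX, HACK) and the word-boundary matching rule below are assumptions for illustration; the paper's exact fuzzy-matching details may differ.

```python
import re

# Assumed set of IDE task annotation tags; illustrative only.
TASK_TAGS = ("todo", "fixme", "xxx", "hack")

def is_satd(comment: str) -> bool:
    """Return True if the comment fuzzily contains a task annotation tag.

    'Fuzzy' here means case-insensitive matching of a tag as a standalone
    word, so "TODO:", "Fixme", and "XXX -" all match, but words that merely
    contain a tag's letters (e.g. "hacker" inside prose) do not.
    """
    text = comment.lower()
    return any(re.search(rf"\b{tag}\b", text) for tag in TASK_TAGS)

print(is_satd("// TODO: refactor this method"))  # True
print(is_satd("// computes the checksum"))       # False
```

Because the rule is a handful of string matches, it needs no training data and its decisions are trivially explainable, which is exactly the appeal of MAT as a baseline.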
Result.
The experimental results reveal that: (1) MAT achieves similar or even superior SATD identification performance compared with existing approaches, regardless of whether non-effort-aware or effort-aware evaluation indicators are considered; (2) the SATD (and non-SATD) comments correctly identified by existing approaches overlap heavily with those identified by MAT; and (3) supervised approaches misclassify many SATD comments marked with task tags as non-SATD, which can easily be corrected by combining them with MAT.
Conclusion.
It appears that the problem of SATD identification has been (unintentionally) over-complicated by our community; that is, the real progress in SATD comment identification is not what it might have been envisaged to be. We hence suggest that, when many task tags are used in the comments of a target project, future SATD identification studies should use MAT as an easy-to-implement baseline to demonstrate the usefulness of any newly proposed approach.
Many studies have explored methods of deriving thresholds for object-oriented (OO) metrics. Unsupervised methods are mainly based on the distributions of metric values, while supervised methods principally rest on the relationships between metric values and the defect-proneness of classes. The objective of this study is to empirically examine whether there are effective threshold values for OO metrics by analyzing existing threshold derivation methods with a large-scale meta-analysis. Based on five representative threshold derivation methods (VARL, ROC, BPP, MFM, and MGM) and 3268 releases from 65 Java projects, we first employ statistical meta-analysis and sensitivity analysis techniques to derive thresholds for 62 OO metrics on the training data. Then, we investigate the predictive performance of the five candidate thresholds for each metric on the validation data to explore which of them can serve as the threshold. Finally, we evaluate their predictive performance on the test data. The experimental results show that 26 of the 62 metrics have a threshold effect, and the thresholds derived by meta-analysis achieve promising GM values and significantly outperform almost all five representative (baseline) thresholds.
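To make the threshold-derivation idea concrete, here is a minimal sketch of one family of methods the abstract mentions: picking, for a single OO metric, the cutoff that maximizes the geometric mean (GM) of recall and specificity, in the spirit of the ROC/MGM baselines. The function name, the toy data, and the "flag classes at or above the cutoff" convention are assumptions for illustration; the study's actual procedures may differ.

```python
import numpy as np

def derive_threshold(metric_values, defective):
    """Scan candidate cutoffs for one metric and return the one that
    maximizes GM = sqrt(recall * specificity) on the given data."""
    values = np.asarray(metric_values, dtype=float)
    labels = np.asarray(defective, dtype=bool)
    best_t, best_gm = None, -1.0
    for t in np.unique(values):
        pred = values >= t  # classes at or above the cutoff flagged defect-prone
        tp = np.sum(pred & labels)
        tn = np.sum(~pred & ~labels)
        recall = tp / max(labels.sum(), 1)
        specificity = tn / max((~labels).sum(), 1)
        gm = (recall * specificity) ** 0.5
        if gm > best_gm:
            best_t, best_gm = t, gm
    return best_t, best_gm

# Toy data: classes with larger metric values tend to be defective.
t, gm = derive_threshold([1, 2, 3, 10, 12, 15], [0, 0, 0, 1, 1, 1])
print(t, gm)  # 10.0 1.0
```

A metric exhibits a "threshold effect" in this sense only when such a cutoff also predicts well on held-out data, which is why the study validates candidate thresholds on separate validation and test sets rather than on the training releases alone.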