2017 IEEE International Conference on Software Maintenance and Evolution (ICSME) 2017
DOI: 10.1109/icsme.2017.14

Bug or Not? Bug Report Classification Using N-Gram IDF

Abstract: Previous studies have found that a significant number of bug reports are misclassified between bugs and non-bugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug report classification model with N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to extract key terms of any length from texts; these key terms can be used as features to classify bug reports. We …
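The idea of using IDF-weighted n-grams as classification features can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the authors' method: it applies ordinary IDF, log(N / df), to word n-grams, whereas the N-gram IDF named in the abstract is a theoretical extension whose weighting is not given here. The function names (`word_ngrams`, `ngram_idf`, `featurize`) and the sample reports are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, max_n):
    """Return the set of word n-grams (n = 1..max_n) found in a text."""
    tokens = text.lower().split()
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def ngram_idf(documents, max_n=2):
    """Plain IDF, log(N / df), per n-gram.

    Assumption: this is standard IDF applied to n-grams, not the
    N-gram IDF extension described in the abstract.
    """
    doc_freq = Counter()
    for doc in documents:
        # Each n-gram counts at most once per document.
        doc_freq.update(word_ngrams(doc, max_n))
    total = len(documents)
    return {gram: math.log(total / df) for gram, df in doc_freq.items()}


def featurize(text, vocabulary, max_n=2):
    """Binary feature vector: 1 if the key term occurs in the text, else 0."""
    grams = word_ngrams(text, max_n)
    return [1 if term in grams else 0 for term in vocabulary]


# Hypothetical bug-tracker entries, for illustration only.
reports = [
    "app crashes with null pointer exception on startup",
    "null pointer exception when saving file",
    "please add dark mode support",
    "add support for exporting reports",
]

idf = ngram_idf(reports, max_n=2)
```

A vector produced by `featurize` could then be passed to any standard classifier; which terms to keep as the vocabulary (for example, by thresholding the IDF score) is a design choice this sketch leaves open.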

citations
Cited by 54 publications
(47 citation statements)
references
References 16 publications
(27 reference statements)
“…Zolkeply and Shao (2019) propose to not use TFM frequencies, but simply the occurrence of one of 60 keywords as binary features and use these to train a Classification Association Rule Mining (CARM). Terdchanakul et al (2017) propose to go beyond the TFM and instead use the Inverse Document Frequency (IDF) of n-grams for the title and descriptions of the issues as input for either LR or RF as classifier. Zhou et al (2016) propose an approach that combines the TFM from the issue title with structured information about the issue, e.g., the priority and the severity.…”
Section: Supervised Approaches
mentioning
confidence: 99%
“…We implemented the approaches as they were described and refer to them by the family name of the first author, year of publication, and acronym for the classifier. The approaches from the literature we consider are (in alphabetical order) Kallis2019-FT by Kallis Limsettho et al (2014a), and Terdchanakul2017-LR and Terdchanakul2017-RF by Terdchanakul et al (2017).…”
Section: Baselines
mentioning
confidence: 99%
“…To resolve these problems, artificial intelligence techniques are now actively being studied, and have shown better classification accuracy than traditional (non-artificial intelligence based) methods [25][26][27][28][29][30][31][32][33][34][35][36]. Such techniques can be a key to solving most of the current problems regarding this issue.…”
mentioning
confidence: 99%
“…To reduce the effort required in this regard, studies have proposed the application of state-of-the-art automation methods for bug report classification [25][26][27][28][29]. In particular, latent Dirichlet allocation (LDA)-based classification methods are common because they are suitable to bug reports that contain text-based data.…”
mentioning
confidence: 99%