Bug or Not? Bug Report Classification Using N-Gram IDF

Terdchanakul, Pannavat; Hata, Hideaki; Phannachitta, Passakorn; Matsumoto, Kenichi

doi:10.1109/icsme.2017.14

Cited by 54 publications

(47 citation statements)

References 16 publications

(27 reference statements)

“…Zolkeply and Shao (2019) propose to not use TFM frequencies, but simply the occurrence of one of 60 keywords as binary features and use these to train a Classification Association Rule Mining (CARM). Terdchanakul et al (2017) propose to go beyond the TFM and instead use the Inverse Document Frequency (IDF) of n-grams for the title and descriptions of the issues as input for either LR or RF as classifier. Zhou et al (2016) propose an approach that combines the TFM from the issue title with structured information about the issue, e.g., the priority and the severity.…”

Section: Supervised Approaches (mentioning)

confidence: 99%

“…We implemented the approaches as they were described and refer to them by the family name of the first author, year of publication, and acronym for the classifier. The approaches from the literature we consider are (in alphabetical order) Kallis2019-FT by Kallis Limsettho et al (2014a), and Terdchanakul2017-LR and Terdchanakul2017-RF by Terdchanakul et al (2017).…”

Section: Baselines (mentioning)

confidence: 99%

“…Researchers suggested an alternative through the automated classification of issue types by analyzing the issue titles and descriptions with unsupervised machine learning based on clustering the issues Limsettho et al (2014b), Hammad et al (2018) and Chawla and Singh (2018) and supervised machine learning that create classification models (Antoniol et al 2008;Pingclasai et al 2013;Limsettho et al 2014a;Chawla and Singh 2015;Zhou et al 2016;Terdchanakul et al 2017;Pandey et al 2018;Qin and Sun 2018;Zolkeply and Shao 2019;Otoom et al 2019;Kallis et al 2019). There are two possible use cases for such automated classification models.…”

Section: Introduction (mentioning)

confidence: 99%

On the feasibility of automated prediction of bug and non-bug issues

Herbold

Trautsch

2020

Empir Software Eng

Context: Issue tracking systems are used to track and describe tasks in the development process, e.g., requested feature improvements or reported bugs. However, past research has shown that the reported issue types often do not match the description of the issue. Objective: We want to understand the overall maturity of the state of the art of issue type prediction with the goal to predict if issues are bugs and evaluate if we can improve existing models by incorporating manually specified knowledge about issues. Method: We train different models for the title and description of the issue to account for the difference in structure between these fields, e.g., the length. Moreover, we manually detect issues whose description contains a null pointer exception, as these are strong indicators that issues are bugs. Results: Our approach performs best overall, but not significantly different from an approach from the literature based on the fastText classifier from Facebook AI Research. The small improvements in prediction performance are due to structural information about the issues we used. We found that using information about the content of issues in form of null pointer exceptions is not useful. We demonstrate the usefulness of issue type prediction through the example of labelling bugfixing commits. Conclusions: Issue type prediction can be a useful tool if the use case allows either for a certain amount of missed bug reports or the prediction of too many issues as bug is acceptable.

“…To resolve these problems, artificial intelligence techniques are now actively being studied, and have shown better classification accuracy than traditional (non-artificial intelligence based) methods [25][26][27][28][29][30][31][32][33][34][35][36]. Such techniques can be a key to solving most of the current problems regarding this issue.…”

mentioning

confidence: 99%

“…To reduce the effort required in this regard, studies have proposed the application of state-of-the-art automation methods for bug report classification [25][26][27][28][29]. In particular, latent Dirichlet allocation (LDA)-based classification methods are common because they are suitable to bug reports that contain text-based data.…”

mentioning

confidence: 99%

Improving bug report triage performance using artificial intelligence based document generation model

Lee

Seo

2020

Hum. Cent. Comput. Inf. Sci.

Along with the fourth industrial revolution, artificial intelligence, big data, Internet of Things, and cloud computing are emerging as cutting-edge technologies globally. In particular, artificial intelligence has unlimited potential to further improve the quality of human life and can solve several difficult engineering problems [1-12]. Moreover, this technology provides basic ideas to derive successful solutions to numerous problems encountered in the software development field.

A Process Framework for the Classification of Security Bug Reports

Hussain

2022

Evolving Software Processes
