Shubhra Goyal Jindal scite author profile

2020

IEEE Access

Text Summarization is a process which efficiently retrieves the relevant information from documents. The objective of the proposed, unsupervised approach is to summarize bug reports (software artefacts) with complete content and diversified information. The proposed approach utilizes Rapid Automatic Keyword Extraction and term frequency-inverse document frequency method to extract meaningful keywords and key-phrases with a relevant score. For sentence extraction, fuzzy C-means clustering is used to extracts sentences having high degree of membership from each cluster above a set threshold value. A ruleengine is used for sentence selection. The rules are generated with the domain knowledge and based on the extracted information by the keywords and sentences selected by the clustering method. Cohesive and coherent summary is generated by the proposed method on apache bug reports. For redundancy removal and to re-rank generated summary, hierarchical clustering is presented to enrich the extracted summary. The proposed approach is evaluated on newly constructed Apache project Bug Report Corpus (APBRC) and existing Bug Report Corpus (BRC). The results are compared on the basis of performance metrics such as precision, recall, pyramid precision and F-score. The experimental results depict that our proposed approach attains significant improvement over other baseline approaches such as BRC and LRCA. It also attains significant improvement over existing state-of-art unsupervised approaches such as Hurried, centroid and others. It extracts significant keyword phrases and sentences from each cluster to achieve full coverage and coherent summary. The results evaluated on APBRC corpus attains an average value of 78.22%, 82.18%, 80.10% and 81.66% for precision, recall, f-score and pyramid precision respectively. INDEX TERMS Text summarization, rapid automatic keyword extraction, fuzzy c-means, hierarchical clustering, bug reports, rule engine.

show abstract

Severity Prediction Of Bug Reports using Text Mining: A Systematic Review

2018

Bug report collection system (BRCS)

2017

Text analytics based severity prediction of software bugs for apache projects

Int J Syst Assur Eng Manag

2019

Severity i.e impact, extent and effect on software is a decisive attribute which decides how instantly the bug should be fixed. Predicting the severity of software bugs is important to improve the bug triaging and resolution process. To reduce the effort and time required in manual assessment of severity of newly reported bugs, many techniques and methods are used in past researches. To help software developers to utilize their resources efficiently, this study evaluates a number of machine learning techniques for predicting the severity of software bugs at system and component level. The techniques are evaluated on thirteen apache projects automatically extracted using the Bug Report Collection System tool. Severity is predicted based on the most frequent terms extracted from the summary of bugs using text mining. Performance metrics such as precision, recall and accuracy are used to interpret the results obtained from various techniques. The result of the study advocates that Boosting (an ensemble learner) technique outperforms other machine learning techniques such as Bayesian learners, decision tree, support vector machine applied in previous researches.

show abstract

Information Retrieval from Software Bug Ontology Exploiting Formal Concept Analysis

Jindal¹,

Kaur²

2020

CyS