New labeled dataset of interconnected lexical typos for automatic correction in the bug reports

Neysiani, Behzad Soleimani; Babamir, Seyed Morteza

doi:10.1007/s42452-019-1419-y

Cited by 6 publications

(1 citation statement)

References 12 publications

“…The first methodology called the information retrieval-based approach, which its procedure is shown in Figure 1. In the first box, the raw dataset of bug reports exists which should be preprocessed in box 2 till deal with null values, unify the data type of some fields like version and priority and preferably change them to numerical, remove stop words from textual fields, stemming textual fields, correcting the typos in textual DFs [5,8] [9], and so on [1,4]. The feature extraction phase of box 6 returns a numerical vector consist of many similarity metrics as box 7.…”

Section: Information Retrieval (IR)-based Methodology of Automatic Duplicate Bug Report Detection (ADBRD)

mentioning

confidence: 99%
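
The preprocessing and feature-extraction steps named in the statement above can be illustrated with a minimal Python sketch. It is not the cited authors' implementation: the `BugReport` fields, the tiny stop-word list, the `naive_stem` suffix stripper, the priority mapping, and the three chosen similarity metrics are all assumptions made for illustration, and the typo-correction step is left out entirely.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, deliberately tiny stop-word list and priority scale.
STOP_WORDS = {"the", "a", "an", "is", "in", "on", "to", "of", "and", "when"}
PRIORITY_MAP = {"low": 1, "medium": 2, "high": 3}


@dataclass
class BugReport:
    # Illustrative fields only; real triage systems store many more.
    title: Optional[str]
    priority: Optional[str]
    version: Optional[str]


def naive_stem(token: str) -> str:
    """Strip a few common suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess_text(text: Optional[str]) -> List[str]:
    """Handle nulls, lowercase, tokenize, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", (text or "").lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def priority_to_number(priority: Optional[str]) -> int:
    """Map a priority label to a number; unknown or null becomes 0."""
    return PRIORITY_MAP.get((priority or "").strip().lower(), 0)


def version_to_number(version: Optional[str]) -> int:
    """Keep only the major version number; unknown or null becomes 0."""
    match = re.match(r"\s*(\d+)", version or "")
    return int(match.group(1)) if match else 0


def jaccard(a: List[str], b: List[str]) -> float:
    """Jaccard similarity of two token lists, 0.0 if both are empty."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def extract_features(r1: BugReport, r2: BugReport) -> List[float]:
    """Return a numerical vector of similarity metrics for a report pair."""
    t1, t2 = preprocess_text(r1.title), preprocess_text(r2.title)
    return [
        jaccard(t1, t2),
        float(priority_to_number(r1.priority) == priority_to_number(r2.priority)),
        float(version_to_number(r1.version) == version_to_number(r2.version)),
    ]


if __name__ == "__main__":
    first = BugReport("Crash when opening the file", "High", "2.1")
    second = BugReport("Crashes on opening a document", "high", "2.0")
    print(extract_features(first, second))
```

In a pipeline of the kind the statement describes, a vector like this would be computed for each candidate pair of reports and passed on to a later ranking or classification stage.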

Duplicate Detection Models for Bug Reports of Software Triage Systems: A Survey

Neysiani

2019

CTCSA

Self Cite

Categorical DFs, such as the company, product, component, and status of a bug report, group bug reports into specific categories. III. Textual DFs contain the main end-user request, described as a short or long text message, e.g., the title or description. IV. Temporal DFs show the date and time of reporting, assigning, solving, and other events in the life of the bug report. Since about 30%-60% of the bug reports in an STS are duplicates [2,3], automatic duplicate bug report detection (ADBRD) is one of the major problems of STSs. ADBRD needs artificial intelligence techniques such as information retrieval, natural language processing, machine learning, and text and data mining. This study focuses on ADBRD methods: it reviews the methodologies, compares them, and suggests their potential usage [4].

Methodologies of Automatic Duplicate Bug Report Detection

There are two major methodologies for automatic duplicate bug report detection (ADBRD):
