Ruihang Gu scite author profile

Ruihang Gu

3 Publications

167 Citation Statements Received

146 Citation Statements Given

Affiliations

Nanjing University of Aeronautics and Astronautics

Publications

Combining Text Mining and Data Mining for Bug Report Classification

Zhou, Tong, et al. 2014

Misclassification of bug reports inevitably sacrifices the performance of bug prediction models. Manual examinations can help reduce the noise but bring a heavy burden for developers instead. In this paper, we propose a hybrid approach by combining both text mining and data mining techniques of bug report data to automate the prediction process. The first stage leverages text mining techniques to analyze the summary parts of bug reports and classifies them into three levels of probability. The extracted features and some other structured features of bug reports are then fed into the machine learner in the second stage. Data grafting techniques are employed to bridge the two stages. Comparative experiments with previous studies on the same data, three large-scale open source projects, consistently achieve a reasonable enhancement (from 77.4% to 81.7%, 73.9% to 80.2% and 87.4% to 93.7%, respectively) over their best results in terms of overall performance. Additional comparative empirical experiments on two other popular open source repositories confirm the findings and demonstrate the benefits of our approach.

Analyzing APIs Documentation and Code to Detect Directive Defects

Zhou, Chen, et al. 2017

Application Programming Interface (API) documents represent one of the most important references for API users. However, it is frequently reported that the documentation is inconsistent with the source code and deviates from the API itself. Such inconsistencies in the documents inevitably confuse API users, considerably hampering their API comprehension and the quality of software built from such APIs. In this paper, we propose an automated approach to detect defects of API documents by leveraging techniques from program comprehension and natural language processing. In particular, we focus on the directives of the API documents which are related to parameter constraints and exception throwing declarations. A first-order logic based constraint solver is employed to detect such defects based on the obtained analysis results. We evaluate our approach on parts of well documented JDK 1.8 APIs. Experimental results show that, out of around 2000 API usage constraints, our approach can detect 1146 defective document directives, with a precision rate of 81.6% and a recall rate of 82.0%, which demonstrates its practical feasibility.

Combining text mining and data mining for bug report classification

Zhou, Tong, et al. 2016. J. Softw. Evol. and Proc.

103

Bug reports represent an important information source for software construction. Misclassification of these reports inevitably introduces bias. Manual examinations can help reduce the noise, but bring a heavy burden for developers instead. In this paper, we propose a multi‐stage approach by combining both text mining and data mining techniques to automate the prediction process. The first stage leverages text mining techniques to analyze the summary parts of bug reports and classifies them into three levels of probability. The extracted features and some other structured features of bug reports are then fed into the machine learner in the second stage. Data grafting techniques are employed to bridge the two stages. Comparative experiments with previous studies on the same data, three large‐scale open‐source projects, consistently achieve a reasonable enhancement (from 77.4% to 81.7%, 76.1% to 81.6%, and 87.4% to 93.7%, respectively) over their best results in terms of overall performance. Additional comparative empirical experiments on seven other popular open‐source systems confirm the findings. Moreover, based on the data obtained, we also empirically studied the impact relation between the underlying classifiers and various other properties of the combined model. A prototypical recommender system has been developed to demonstrate the applicability of our approach. Copyright © 2016 John Wiley & Sons, Ltd.
