Anh Viet Phan scite author profile

Detecting defective source code so that bugs can be localized and fixed is important for reducing software development effort. Although deep learning models have achieved breakthroughs in this field, several issues remain unresolved, such as the shortage of labeled data and the small size of defective elements: two similar programs that differ by only a single operator or statement may be one clean and one defective. To address these issues, this study proposes a new deep learning model that facilitates the learning of distinguishing features. The model comprises three main components: 1) a convolutional neural network-based classifier, 2) an autoencoder, and 3) a k-means clustering component. In the model, the autoencoder assists the classifier in generating latent representations of programs. The k-means component provides penalty functions that increase the distinguishability among latent representations. We evaluated the effectiveness of the model in terms of performance metrics and latent representation quality. Experimental results on four defect prediction datasets show that the proposed model outperforms the baselines, owing to the sophisticated features it generates.
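The abstract does not give the loss function, so the following is only a rough sketch of how a classification loss, an autoencoder reconstruction error, and k-means-based penalties might be combined into one training objective. Every name here (`shrink_penalty`, `stretch_penalty`, `combined_loss`), the hinge `margin`, and the weights `alpha` and `beta` are hypothetical choices made for illustration, not the paper's actual formulation.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_entropy(probs, label):
    """Classification loss for one sample: negative log-probability of the true label."""
    return -math.log(probs[label])


def reconstruction_error(x, x_hat):
    """Autoencoder loss for one sample: mean squared error between input and reconstruction."""
    return sq_dist(x, x_hat) / len(x)


def shrink_penalty(latents, centroids):
    """Assumed 'shrink' term: mean squared distance from each latent vector
    to its nearest k-means centroid (pulls points toward their cluster)."""
    return sum(min(sq_dist(z, c) for c in centroids) for z in latents) / len(latents)


def stretch_penalty(centroids, margin):
    """Assumed 'stretch' term: mean hinge penalty over centroid pairs that are
    closer than `margin` (pushes clusters apart)."""
    total, pairs = 0.0, 0
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            d = math.sqrt(sq_dist(centroids[i], centroids[j]))
            total += max(0.0, margin - d)
            pairs += 1
    return total / pairs if pairs else 0.0


def combined_loss(ce, recon, shrink, stretch, alpha=1.0, beta=0.1):
    """Hypothetical weighted sum of the classifier, autoencoder, and clustering terms."""
    return ce + alpha * recon + beta * (shrink + stretch)
```

In a real model these terms would be computed on mini-batches inside a deep learning framework so that gradients flow back through the encoder; the plain-Python version above only shows how the three components could interact.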




Anh Viet Phan

Feature weighting and SVM parameters optimization based on genetic algorithms for classification problems

DGCNN: A convolutional neural network over large-scale labeled graphs

Convolutional Neural Networks over Control Flow Graphs for Software Defect Prediction

Convolutional neural networks on assembly code for predicting software defects

Automatically classifying source code using tree-based approaches

Exploiting tree structures for classifying programs by functionalities

Deep learning and sub-tree mining for document level sentiment classification

Learning Stretch-Shrink Latent Representations With Autoencoder and K-Means for Software Defect Prediction
