Yuanting Yan scite author profile

The complex language of eukaryotic gene expression remains incompletely understood. Although many protein variants are statistically associated with human disease, which suggests their importance, nearly all such variants have unknown mechanisms, for example, protein-protein interactions (PPIs). In this study, we address this challenge using a recent machine learning advance, deep neural networks (DNNs). We aim to improve the performance of PPI prediction and propose a method called DeepPPI (Deep neural networks for Protein-Protein Interactions prediction), which employs deep neural networks to effectively learn representations of proteins from common protein descriptors. The experimental results indicate that DeepPPI achieves superior performance on the test data set, with an Accuracy of 92.50%, Precision of 94.38%, Recall of 90.56%, Specificity of 94.49%, Matthews Correlation Coefficient of 85.08%, and Area Under the Curve of 97.43%. Extensive experiments show that DeepPPI can learn useful features of protein pairs through layer-wise abstraction, and thus achieves better prediction performance than existing methods. The source code of our approach is available at http://ailab.ahu.edu.cn:8087/DeepPPI/index.html .
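For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how Accuracy, Precision, Recall, Specificity, and the Matthews Correlation Coefficient are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the DeepPPI paper or its source code.

```python
import math


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. This is the textbook definition of each metric, not
    code from the paper.
    """
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),        # also called sensitivity
        "specificity": _ratio(tn, tn + fp),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

Note that the Matthews Correlation Coefficient ranges from -1 to 1; the abstract reports it as a percentage.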

A Parameter-Free Cleaning Method for SMOTE in Imbalanced Classification

Yan, Liu, Ding, et al. 2019, IEEE Access

Oversampling is an efficient technique for dealing with the class-imbalance problem. It addresses the problem by duplicating or generating minority class samples to balance the distribution between the majority and minority classes. The synthetic minority oversampling technique (SMOTE) is one of its typical representatives. During the past decade, researchers have proposed many variants of SMOTE. However, the existing oversampling methods may generate wrong minority class samples in some scenarios. Furthermore, how to effectively mine the inherent complex characteristics of imbalanced data remains a challenge. To this end, this paper proposes a parameter-free data cleaning method to improve SMOTE based on the constructive covering algorithm. The dataset generated by SMOTE is first partitioned into a group of covers; the hard-to-learn samples are then detected based on the characteristics of the sample space distribution. Finally, a pair-wise deletion strategy is proposed to remove the hard-to-learn samples. The experimental results on 25 imbalanced datasets show that our proposed method is superior to the comparison methods in terms of various metrics, such as F-measure, G-mean, and Recall. Our method not only reduces the complexity of the dataset but also improves the performance of the classification model. INDEX TERMS: Imbalanced data, SMOTE, oversampling, constructive covering algorithm, data cleaning.
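As background for the abstract above, here is a minimal sketch of the interpolation step that plain SMOTE uses to generate minority class samples. It does not implement the paper's constructive-covering cleaning step or its pair-wise deletion strategy. The function name `smote_sample` and the parameters `k`, `n_new`, and `seed` are illustrative assumptions.

```python
import math
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    minority: list of equal-length numeric tuples (needs at least 2 points).
    For each new point, pick a minority sample x, pick one of its k nearest
    minority neighbours nb, and return a random point on the segment x -> nb.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical 2-D minority class, used only for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(points, k=2, n_new=4)
```

Because each synthetic point lies on a segment between two minority samples, it can fall inside a majority-class region when the classes overlap, which is the kind of wrong sample the abstract says a cleaning step should remove.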

A selective neural network ensemble classification for incomplete data

Yan, Zhang, et al. 2016, Int. J. Mach. Learn. & Cyber.

Neighborhood-aware web service quality prediction using deep learning

Jin, Wang, Zhang, et al. 2019, J Wireless Com Network

With the rapid growth of web services on the Internet, it has become more difficult for users to choose high-quality web services from a large number of functionally equivalent candidates. Therefore, predicting quality of service (QoS) values from the usage history of web services has received extensive attention. In recent years, deep learning has achieved great success in speech recognition, image processing, and natural language understanding. However, it is rarely applied to the service recommendation field. Therefore, a novel approach for QoS prediction named NDL (neighborhood-aware deep learning) is proposed. NDL first obtains the Top-k neighbors of the user and of the service through the Pearson correlation coefficient, computed from the service QoS information. Then, it extracts the latent features of the user neighbors and the service neighbors; after that, it feeds the QoS values of the user and the user neighbors, as well as the QoS values of the service and the service neighbors, into a convolutional neural network. The results of experiments conducted on a real-world dataset demonstrate that NDL significantly outperforms current QoS prediction methods in prediction accuracy.
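The neighbor-selection step described in the abstract can be sketched as follows: compute the Pearson correlation coefficient between one row of a QoS matrix and every other row, then keep the k most correlated rows. This is only an illustration of Top-k selection by Pearson correlation, not the NDL implementation. The function names, the use of `None` for unobserved QoS values, and the example matrix are all assumptions.

```python
import math


def pearson(u, v):
    """Pearson correlation over the entries both vectors have observed.

    None marks a missing QoS value (an assumption of this sketch).
    Returns 0.0 when fewer than two shared entries exist or when either
    vector has zero variance over the shared entries.
    """
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    den = math.sqrt(var_a * var_b)
    return cov / den if den else 0.0


def top_k_neighbors(matrix, target, k):
    """Indices of the k rows most correlated with row `target`."""
    scores = [
        (pearson(matrix[target], row), i)
        for i, row in enumerate(matrix)
        if i != target
    ]
    scores.sort(key=lambda s: s[0], reverse=True)
    return [i for _, i in scores[:k]]


# Hypothetical user-by-service response-time matrix (rows are users).
qos = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 2.0, 3.0, 5.0],
]
neighbors_of_user_0 = top_k_neighbors(qos, target=0, k=2)
```

The same function can be applied to the transposed matrix to find neighbors of a service rather than of a user.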

Identification and Analysis of Cancer Diagnosis Using Probabilistic Classification Vector Machines with Feature Selection

Wen, et al. 2018, CBIO

DeepPPI: Boosting Prediction of Protein–Protein Interactions with Deep Neural Networks
