Y. Xin scite author profile

Y. Xin

3 Publications

4 Citation Statements Received

6 Citation Statements Given

Affiliations

Xi'an University of Technology, Pittsburgh Supercomputing Center, Wuhan University of Science and Technology

Publications

A diabetes prediction model based on Boruta feature selection and ensemble learning

Zhou

Xin

2023

BMC Bioinformatics

Background and objective: As a common chronic disease, diabetes is called the “second killer” among modern diseases. Currently, there is no medical cure for diabetes. We can only rely on medication for auxiliary treatment. However, many diabetic patients still die each year. In addition, a considerable number of people do not pay attention to their physical health or opt out of treatment due to lack of money, which eventually leads to various complications. Therefore, diagnosing diabetes at an early stage and intervening early is necessary; thus, developing an early detection method for diabetes is essential.

Methods: In this study, a diabetes prediction model based on Boruta feature selection and ensemble learning is proposed. The model contains the use of Boruta feature selection, the extraction of salient features from datasets, the use of the K-Means++ algorithm for unsupervised clustering of data and stacking of an ensemble learning method for classification. It has been validated on a diabetes dataset.

Results: The experiments were performed on the PIMA Indian diabetes dataset. The model was evaluated by accuracy, precision and F1 index. The obtained results show that the accuracy rate of the model reaches 98% and achieves good results.

Conclusion: Compared with other diabetes prediction models, this model achieved better results, and the obtained results indicate that this model is superior to other models in diabetes prediction and has better performance.
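
Of the steps named in the Methods, the K-Means++ step is the easiest to illustrate in isolation. The following is a minimal sketch of K-Means++ seeding only, written in plain Python; it is not the authors' implementation, and the function names, the toy points and the seed are assumptions made for illustration. Each new centre is drawn with probability proportional to its squared distance from the nearest centre already chosen.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centres with K-Means++ seeding.

    Assumes the points are distinct and k <= len(points); with duplicate
    points all weights can become zero and random.choices would raise.
    """
    rng = random.Random(seed)
    # The first centre is drawn uniformly at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight of a point = squared distance to its nearest chosen centre,
        # so already-chosen points get weight 0 and cannot be drawn again.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


# Hypothetical toy data: three loose groups of 2-D points.
toy = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0), (9.8, 0.3)]
centers = kmeans_pp_init(toy, k=3, seed=42)
print(centers)
```

In a full pipeline these centres would only start the usual K-Means iterations; the Boruta selection and the stacking classifier described in the abstract are not covered by this sketch.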

Prediction of protein folding using the shift-learning method with a large scale neural network

Poliac

Wilcox

Xin

et al. 1991

encoding the association between protein sequence and three dimensional structure for a small heterologous training set of small proteins. In the present study, we report the application of this approach to a selected homologous training set of 8 proteins using the Cray 2 supercomputer at the Minnesota Supercomputer Center, Minneapolis USA. The large memory of this machine allowed us to configure a network with more than .3 million connections and 30,000 neural units; a network of this size was necessary to accommodate a new training/testing set with 8 proteins of up to 140 amino acid residues. This training set was constructed to investigate the performance of the neural network approach in prediction of structures within the protease class of proteins; proteases are enzymes which cleave the peptide bonds which join individual amino acid residues of other proteins. The network learned the sequence-structure association for 4 of the proteins within 100 iterations selected in a random order and shifted by a random offset to the left or to the right. When presented with novel sequences from related proteins, the network was able to predict three dimensional structures of the four proteins in the testing set. The results of this study suggest that a neural network trained to recognize the entire sequence of a protein using the shift-learn method can retain some of the rules of protein folding in a form which allows prediction of three dimensional structures. Our findings indicate that large scalar or vector supercomputer architectures are ideal for implementation of useful backpropagation neural networks.
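
The presentation scheme described above (sequences drawn in random order, each shifted left or right by a random offset) can be sketched as follows. This is only one possible reading of the abstract, not the authors' code: the fixed-width input window, the `-` padding symbol, the `max_shift` parameter and both function names are assumptions introduced for illustration.

```python
import random

PAD = "-"  # hypothetical padding symbol; sequences must not contain it


def place_with_shift(sequence, window, offset):
    """Place a sequence in a fixed-width window, shifted from the centre.

    A negative offset moves the sequence left, a positive one right.
    """
    margin = window - len(sequence)
    if margin < 0:
        raise ValueError("sequence is longer than the input window")
    start = margin // 2 + offset
    if not 0 <= start <= margin:
        raise ValueError("offset pushes the sequence out of the window")
    return PAD * start + sequence + PAD * (margin - start)


def shifted_epoch(sequences, window, max_shift, rng):
    """Yield every sequence once, in random order, at a random offset."""
    order = list(sequences)
    rng.shuffle(order)
    for seq in order:
        margin = window - len(seq)
        centre = margin // 2
        # Clamp the offset range so the sequence always stays in the window.
        lo = max(-max_shift, -centre)
        hi = min(max_shift, margin - centre)
        yield place_with_shift(seq, window, rng.randint(lo, hi))


# Hypothetical toy sequences standing in for amino acid strings.
toy_sequences = ["ACDEF", "GHIKLMN", "PQR"]
for row in shifted_epoch(toy_sequences, window=12, max_shift=3, rng=random.Random(1)):
    print(row)
```

Each padded row would then be encoded and fed to the network; varying the offset between iterations means the network cannot rely on a residue always appearing at the same input position.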

An improved Chinese text multi-label classification method based on CNN

Xin

Zhi

2020

J. Phys.: Conf. Ser.

Text multi-label classification technology can accurately and quickly classify text information into related categories or topics, and help people quickly locate the required content in massive information resources, which is of great significance in application. Because traditional classification algorithms suffer from low classification accuracy due to the low correlation of data labels, unbalanced label data and few short text feature words, this paper first performs hierarchical pre-processing on label data to transform multi-label classification into hierarchical text multi-classification. At the same time, an improved multi-label classification algorithm, Multi-label Convolutional Neural Networks (ML-CNN), is proposed. Based on the TensorFlow framework, a CNN model is designed and different training models are constructed for each level of label classification. According to the number of classification levels, the output of the upper level label is stitched to the original input tail as the next level of input. Experiments on the description information of 500,000 labelled Chinese products show that the improved algorithm significantly improves the classification accuracy and that the accuracy of each level can reach more than 88%, which proves the feasibility and effectiveness of the algorithm.
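
The level-by-level chaining described above, where the upper-level label is appended to the tail of the original input before the next level is classified, can be sketched as follows. The per-level CNNs are replaced here by hypothetical keyword rules, and the English tokens, label names and function names are all assumptions for illustration; appending only the immediate upper-level label to the original input is one reading of the abstract.

```python
def predict_hierarchy(tokens, level_classifiers):
    """Run one classifier per label level, top level first.

    Each classifier after the first sees the original tokens with the
    upper-level label stitched onto the tail.
    """
    labels = []
    level_input = list(tokens)
    for classify in level_classifiers:
        label = classify(level_input)
        labels.append(label)
        # Next level input = original input + label predicted one level up.
        level_input = list(tokens) + [label]
    return labels


# Hypothetical stand-ins for the trained per-level models.
def level1(tokens):
    return "electronics" if "phone" in tokens else "clothing"


def level2(tokens):
    parent = tokens[-1]  # the stitched upper-level label
    if parent == "electronics":
        return "mobile" if "phone" in tokens else "other-electronics"
    return "shoes" if "shoes" in tokens else "other-clothing"


print(predict_hierarchy(["red", "phone", "case"], [level1, level2]))
print(predict_hierarchy(["running", "shoes"], [level1, level2]))
```

Passing the parent label as part of the input lets the lower-level model condition on it, which is how a multi-label problem becomes a sequence of single-label decisions in this scheme.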

