Human leukocyte antigen (HLA) is a type of molecule that resides on the surface of most human cells and plays an essential role in the immune system's response to foreign invaders. T cell antigen receptors may recognize HLA-peptide complexes on the surface of cancer cells, and cytotoxic T lymphocytes then destroy these cells. Computational identification of HLA-binding peptides will facilitate the rapid development of cancer immunotherapies. This study hypothesized that peptide features encoded by a natural language processing model may be further enriched by another deep neural network. The hypothesis was tested by applying a Bi-directional Long Short-Term Memory network to extract features from the pretrained Protein Bidirectional Encoder Representations from Transformers encodings of class I HLA (HLA-I)-binding peptides. The experimental results showed that the proposed HLAB feature engineering algorithm outperformed existing methods in detecting HLA-I-binding peptides. In an extensive evaluation, HLAB outperformed all seven existing studies in AUC when predicting peptides that bind to the HLA-A*01:01 allele, and it achieved the best average AUC on six of the seven k-mer prediction tasks (k = 8, 9, ..., 14, where each task concerns peptides consisting of k amino acids), the exception being the 9-mer task. The source code and the fine-tuned feature extraction models are available at http://www.healthinformaticslab.org/supp/resources.php.
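The per-length AUC comparison described above can be illustrated with a minimal sketch. It is not the HLAB code: the function names `auc` and `auc_by_length` are hypothetical, and the peptides, labels, and scores passed in are assumed to come from some binding predictor. The AUC is computed with the rank-based (Mann-Whitney) definition, and predictions are grouped by peptide length k.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_length(peptides, labels, scores):
    """Group predictions by peptide length k and return {k: AUC}."""
    groups = defaultdict(lambda: ([], []))
    for pep, y, s in zip(peptides, labels, scores):
        ys, ss = groups[len(pep)]
        ys.append(y)
        ss.append(s)
    return {k: auc(ys, ss) for k, (ys, ss) in sorted(groups.items())}
```

Averaging the values returned by `auc_by_length` over several alleles would give one average AUC per k, which is the kind of quantity the abstract compares across methods.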
This study focused on predicting "proactive personality". From 901 participants selected by cluster sampling, we collected text from targeted short-answer questions and from participants' social media posts (Weibo), while experts rated each participant's proactive personality label. Five machine learning algorithms were used for classification: Support Vector Machine (SVM), XGBoost, K-Nearest-Neighbors (KNN), Naive Bayes (NB) and Logistic Regression (LR). Seven indicators, namely Accuracy (ACC), F1-score (F1), Sensitivity (SEN), Specificity (SPE), Positive Predictive Value (PPV), Negative Predictive Value (NPV) and Area under Curve (AUC), were combined with hierarchical cross-validation to evaluate the models comprehensively. Using participants' Weibo text and short-answer question text, we proposed a new approach to classifying individuals' proactive personality based on text mining. The results showed that the combined short-answer + Weibo text dataset performed best, followed by the short-answer text dataset, while the Weibo text dataset performed worst. However, the Weibo text dataset had the highest average SPE, which indicates that Weibo text plays an important role in identifying individuals with low proactive personality. Adding Weibo text also improved SEN compared with using short-answer text alone. In addition, SPE was higher than SEN on all three datasets, indicating that this text classification approach is better at identifying college students with low proactive personality. Among the algorithms, Support Vector Machine and Logistic Regression showed steadier performance than the others. INDEX TERMS: Machine learning, proactive personality, text mining.
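Six of the seven indicators listed above follow directly from the confusion matrix, as the sketch below shows. It assumes binary labels where 1 means high proactive personality (the positive class) and 0 means low; that coding, and the function name `binary_metrics`, are assumptions made for illustration. AUC is left out because it requires continuous scores rather than hard predictions.

```python
def binary_metrics(y_true, y_pred):
    """Compute ACC, F1, SEN, SPE, PPV and NPV from hard binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "ACC": ratio(tp + tn, tp + tn + fp + fn),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "SEN": ratio(tp, tp + fn),  # recall on the positive (high) class
        "SPE": ratio(tn, tn + fp),  # recall on the negative (low) class
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }
```

Under this coding, SPE higher than SEN means the classifier recovers a larger share of the low proactive personality group than of the high group, which matches the interpretation given in the abstract.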
Recently, combining quantitative and qualitative analysis has become popular in research. In career studies, subjective and objective information together are ideal for assessing individual career development and are relevant to career counseling. This paper measures career adaptability by combining text mining and item response theory (IRT), with college students' self-reported career adaptability as a subjective measure and responses to questionnaire items as an objective measure. The two are combined under a Bayesian framework. Additionally, the validity of text classification, IRT, and the combined measurement model was explored; text classification results were used as prior information when estimating IRT ability parameters to test whether adding prior information improves accuracy. This study draws the following conclusions: (1) In 300-person samples, the text classification method had the highest sensitivity; however, the text-IRT method had the best predictive effect, high reliability, and unique advantages in accuracy. (2) In 600-person samples, the text classification method had the best predictive effect. The effect was relatively good, with unique advantages in identifying low career adaptability. However, the method should be selected according to actual needs: if the accuracy requirement is high and sensitivity can be sacrificed, the text-IRT method is more appropriate. (3) In 900-person samples, the text-IRT method is more suitable when accuracy, sensitivity, and specificity all need to be considered, and text classification is best for identifying low career adaptability. (4) Sample size influenced the accuracy, specificity, and negative predictive value of text classification, as well as the sensitivity of the IRT and text-IRT methods. INDEX TERMS: Career adaptability, item response theory, text mining.
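The idea of using a text classification result as prior information for an IRT ability estimate can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: it assumes a dichotomous two-parameter logistic (2PL) model with known item parameters, a normal prior whose mean is shifted by the text classifier's output, and expected a posteriori (EAP) estimation on a fixed grid. The names `p_endorse` and `eap_ability`, and the mapping from classifier output to prior mean, are hypothetical.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item with discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_ability(responses, items, prior_mean=0.0, prior_sd=1.0, grid_points=121):
    """EAP ability estimate on a grid over [-4, 4] with a normal prior.

    responses: list of 0/1 item responses.
    items: list of (a, b) pairs, assumed known.
    prior_mean, prior_sd: prior on ability; here the mean is assumed to be
    set from a text classifier's prediction.
    """
    thetas = [-4.0 + 8.0 * i / (grid_points - 1) for i in range(grid_points)]
    weights = []
    for theta in thetas:
        # Log prior (up to a constant) plus log likelihood of the responses.
        log_w = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
        for u, (a, b) in zip(responses, items):
            p = p_endorse(theta, a, b)
            log_w += math.log(p if u == 1 else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    return sum(t * w for t, w in zip(thetas, weights)) / total
```

For example, calling `eap_ability(responses, items, prior_mean=-1.0)` for a respondent whose text was classified as low career adaptability pulls the estimate downward relative to the default prior, and the pull weakens as more questionnaire items are observed.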