The coordinated recognition of virus-derived T cell epitopes and MHC molecules by T cells plays a pivotal role in cellular immunity-mediated virus clearance. It has been demonstrated that the conformation of MHC class I (MHC I) molecules can be adjusted by the presented peptide, which impacts T cell activation. However, it is still largely unknown whether the conformational shift of MHC I influences the protective effect of virus-specific T cells. In this study, utilizing the Middle East respiratory syndrome coronavirus-infected mouse model, we observed that through the unusual secondary anchor Ile5, a CD8 T cell epitope drove the conformational fit of Trp on the α1 helix of murine MHC I H-2K. In vitro renaturation and circular dichroism assays indicated that this shift of the structure did not influence the peptide/MHC I binding affinity. Nevertheless, the T cell recognition and the protective effect of the peptide diminished when we made an Ile to Ala mutation at position 5 of the original peptide. The molecular bases of the concordant recognition of T cell epitopes and host MHC-dependent protection were demonstrated through both crystal structure determination and tetramer staining using the peptide-MHC complex. Our results indicate a coordinated MHC I/peptide interaction mechanism and provide a beneficial reference for T cell-oriented vaccine development against emerging viruses such as Middle East respiratory syndrome coronavirus.
Background: Lysine succinylation is a recently discovered post-translational modification that plays a key role in regulating protein conformation and controlling cellular function. Understanding the mechanism of succinylation in depth requires accurate identification of succinylation sites in proteins. However, traditional experimental approaches are labor-intensive and time-consuming. Computational prediction methods have been proposed in recent years and are popular because of their convenience and speed. In this study, we developed a new method to predict succinylation sites in proteins by combining multiple features, including amino acid composition, binary encoding, physicochemical properties, and grey pseudo amino acid composition, with a feature selection scheme (information gain). The model was then trained using a support vector machine (SVM) and an ensemble learning algorithm. Results: The method achieved an accuracy of 89.14% and a Matthews correlation coefficient (MCC) of 0.79 using 10-fold cross-validation on the training dataset, and an accuracy of 84.5% and an MCC of 0.2 on the independent dataset. Conclusions: The conclusions drawn from this study can help to better understand the succinylation mechanism. These results suggest that our method is promising for predicting succinylation sites. The source code and data of this paper are freely available at https://github.com/ningq669/PSuccE. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2249-4) contains supplementary material, which is available to authorized users.
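Two of the feature types named in the abstract above, amino acid composition and binary encoding, can be illustrated with a minimal sketch. The function names, the example peptide, and the handling of non-standard residues are assumptions made for illustration; this is not the PSuccE implementation, and the physicochemical, grey pseudo amino acid composition, and information gain steps are not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide):
    """Fraction of each standard residue among the standard residues present."""
    residues = [r for r in peptide if r in AMINO_ACIDS]
    total = len(residues) or 1  # avoid division by zero on empty input
    return [residues.count(a) / total for a in AMINO_ACIDS]


def binary_encoding(peptide):
    """One-hot encode every position; non-standard symbols become all zeros."""
    vector = []
    for residue in peptide:
        vector.extend(1 if residue == a else 0 for a in AMINO_ACIDS)
    return vector


def encode(peptide):
    """Concatenate both encodings into one feature vector."""
    return amino_acid_composition(peptide) + binary_encoding(peptide)


# Hypothetical 5-residue fragment: 20 composition values + 5 * 20 one-hot values.
features = encode("AKKLA")
print(len(features))
```

A vector like this could then be passed to a feature selection step and a classifier; the choice of window length and classifier settings is left open here.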
Background: Ubiquitination, also called “lysine ubiquitination”, occurs when a ubiquitin is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also in other cellular functions. Thus, systematic analysis of the ubiquitination proteome is an appealing and challenging research topic. The existing methods for identifying protein ubiquitination sites fall into two kinds: mass spectrometry and computational methods. Mass spectrometry-based experimental methods can discover ubiquitination sites in eukaryotes, but they are time-consuming and expensive. Therefore, it is a priority to develop computational approaches that can effectively and accurately identify protein ubiquitination sites. Results: Existing computational methods usually require feature engineering, which may lead to redundancy and biased representations, whereas deep learning can extract underlying characteristics from large-scale training data via multi-layer networks and non-linear mapping operations. In this paper, we propose a deep architecture with multiple modalities to identify ubiquitination sites. First, based on prior knowledge and biological knowledge, we encoded protein sequence fragments around candidate ubiquitination sites into three modalities, namely raw protein sequence fragments, physico-chemical properties, and sequence profiles, and designed different deep network layers to extract hidden representations from them. Then, the generative deep representations corresponding to the three modalities were merged to build the final model. We evaluated our algorithm on PLMD, the largest available protein ubiquitination site database, and achieved 66.4% specificity, 66.7% sensitivity, 66.43% accuracy, and an MCC of 0.221. A number of comparative experiments also indicated that our multimodal deep architecture outperformed several popular protein ubiquitination site prediction tools. Conclusion: The comparative experiments validated the effectiveness of our deep network, which outperformed several popular protein ubiquitination site prediction tools. The source code of our proposed method is available at https://github.com/jiagenlee/deepUbiquitylation.
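The first preprocessing step described above, taking sequence fragments around candidate lysine sites, might look like the following sketch. The function name, the flank length, and the use of "X" as a padding symbol at the termini are assumptions for illustration and are not taken from the deepUbiquitylation code.

```python
def lysine_windows(sequence, flank=3, pad="X"):
    """Return (1-based position, fragment) for every lysine in the sequence.

    Each fragment has length 2 * flank + 1 with the lysine in the centre;
    positions beyond the termini are filled with the padding symbol.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in the sequence sits at i + flank in the padded string,
            # so the window starts at i and spans 2 * flank + 1 characters.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


# Hypothetical toy sequence with lysines at positions 2 and 4.
for position, fragment in lysine_windows("MKAK", flank=2):
    print(position, fragment)
```

Each fragment could then be encoded separately for each modality (raw sequence, physico-chemical properties, sequence profile) before being fed to the corresponding network branch.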
Protein pupylation is a type of post-translational modification that plays a crucial role in the cellular function of bacteria. To gain better insight into the mechanisms underlying pupylation, an initial but important step is to identify pupylation sites. To date, several computational methods have been established for predicting pupylation sites; these usually construct negative samples artificially from verified pupylation proteins to train the classifiers. However, if this process is not done properly, it can dramatically affect the performance of the final predictor. In this work, unlike previous computational methods, we propose an enhanced positive-unlabeled learning algorithm (EPuL) for the pupylation site prediction problem, which uses only positive and unlabeled samples. First, we separate the training dataset into a positive dataset and an unlabeled dataset containing the remaining non-annotated lysine residues. Then, the EPuL algorithm is used to select a reliable initial negative dataset and to iteratively pick out further non-pupylation sites. The proposed method achieved an accuracy of 90.24%, an area under the curve (AUC) of 0.93, and an MCC of 0.81 in 10-fold cross-validation. A user-friendly web server for predicting pupylation sites was developed and is freely available.
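The general idea of positive-unlabeled learning described above (seed a reliable negative set from the unlabeled pool, then grow it iteratively) can be sketched with a deliberately simple centroid rule. This is a generic two-step illustration, not the EPuL algorithm: the distance-to-centroid criterion, the `seed_fraction` parameter, and the toy 2-D feature vectors are all assumptions.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pu_select_negatives(positives, unlabeled, seed_fraction=0.25, max_iter=10):
    """Split the unlabeled pool into likely negatives and still-unlabeled samples."""
    pos_c = centroid(positives)
    # Step 1: seed with the unlabeled samples farthest from the positive centroid.
    ranked = sorted(unlabeled, key=lambda x: distance(x, pos_c), reverse=True)
    n_seed = max(1, int(len(ranked) * seed_fraction))
    negatives, remaining = ranked[:n_seed], ranked[n_seed:]
    # Step 2: repeatedly absorb samples that lie closer to the negative centroid.
    for _ in range(max_iter):
        neg_c = centroid(negatives)
        moved = [x for x in remaining if distance(x, neg_c) < distance(x, pos_c)]
        if not moved:
            break
        negatives = negatives + moved
        remaining = [x for x in remaining if x not in moved]
    return negatives, remaining


# Toy data: two positives near (1, 1) and four unlabeled samples.
positives = [(1.0, 1.0), (1.2, 0.8)]
unlabeled = [(1.1, 0.9), (5.0, 5.0), (4.8, 5.2), (4.0, 4.0)]
negatives, remaining = pu_select_negatives(positives, unlabeled)
print(sorted(negatives), remaining)
```

In practice the centroid rule would be replaced by a trained classifier, and the samples left in `remaining` would stay unlabeled rather than being treated as positives.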
Glycation is a non-enzymatic process, occurring inside or outside the host body, in which a sugar molecule attaches to a protein or lipid molecule. It is an important form of post-translational modification (PTM) that impairs the function and changes the characteristics of proteins, so identifying glycation sites may provide useful guidelines for understanding various biological functions of proteins. In this study, we proposed an accurate prediction tool, named Glypre, for lysine glycation. Firstly, we used multiple informative features to encode the peptides. These features included the position scoring function, secondary structure, AAindex, and the composition of k-spaced amino acid pairs. Secondly, the distribution of distinctive features of the residues surrounding the glycation and non-glycation sites was statistically analysed. Thirdly, based on the distribution of these features, we developed a new predictor using different optimal window sizes for different properties and a two-step feature selection method, which applied the maximum relevance minimum redundancy method followed by a greedy feature selection procedure. Glypre achieved a sensitivity of 57.47%, a specificity of 90.78%, an accuracy of 79.68%, an area under the receiver-operating characteristic (ROC) curve (AUC) of 0.86, and a Matthews correlation coefficient (MCC) of 0.52 in 10-fold cross-validation. The detailed analysis showed that our predictor may play a complementary role to other existing methods for identifying protein lysine glycation. The source code and datasets of Glypre are available in the Supplementary File.
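One of the encodings listed above, the composition of k-spaced amino acid pairs, can be sketched as follows. The function name, the default `k_max`, and the choice to ignore pairs containing non-standard symbols are assumptions for illustration; the Glypre feature pipeline itself is not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(peptide, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    For each gap k, counts every pair (residue i, residue i + k + 1) and
    divides by the number of such pairs in the peptide, giving 400 values
    per gap.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(peptide) - k - 1
        for i in range(max(n_pairs, 0)):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard symbols
                counts[pair] += 1
        denominator = max(n_pairs, 1)
        features.extend(counts[p] / denominator for p in PAIRS)
    return features


# Hypothetical 4-residue peptide, gaps k = 0 and k = 1: 2 * 400 features.
vector = cksaap("AKAK", k_max=1)
print(len(vector), vector[PAIRS.index("AK")])
```

Because the vector grows by 400 values per gap, a feature selection step such as the two-step scheme mentioned above is a natural companion to this encoding.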
Background: The suppressors of cytokine signaling (SOCS) family plays important roles in the development of cancers by inhibiting signal transmission through the Janus kinase–signal transducer and activator of transcription (JAK-STAT) pathway. However, the expression patterns and prognostic value of SOCS family genes in non-small cell lung cancer (NSCLC) remain unclear. Methods: The expression profiles of SOCS family genes were explored using the ONCOMINE and GEPIA online tools. The mutations and copy number alterations of SOCS family genes in NSCLC were assessed using cBioPortal for Cancer Genomics. The methylation status of SOCS family members was analyzed through the MEXPRESS and UCSC Xena websites. The prognostic value of SOCS family genes in NSCLC was explored through the Kaplan-Meier Plotter database. Results: The expression levels of SOCS2, SOCS3, and cytokine-inducible SH2-containing protein (CIS/CISH) were significantly reduced in NSCLC tissues compared to normal lung tissues. Aberrant DNA methylation of SOCS family genes was frequent in NSCLC. CISH methylation was negatively correlated with gene expression in NSCLC. The Kaplan-Meier Plotter analysis demonstrated that high expression of SOCS1 may be a predictor of poor prognosis in lung adenocarcinoma (LUAD) but served as a favorable prognostic marker in lung squamous cell carcinoma (LUSC). High expression levels of SOCS2 and SOCS4-7 were significantly correlated with better overall survival (OS) in LUAD but not in LUSC patients. Conclusions: Our findings indicate that aberrant gene expression and DNA methylation of SOCS family members are common in NSCLC and contribute to tumorigenesis. SOCS family genes may serve as therapeutic targets and prognostic biomarkers for NSCLC patients.