Proteins carry out the most fundamental processes of life, such as cellular metabolism, regulation, and communication. Understanding these processes at a molecular level requires knowledge of their three-dimensional structures. Experimental techniques such as X-ray crystallography, NMR spectroscopy, and cryogenic electron microscopy can resolve protein structures but are costly and time-consuming and do not work for all proteins. Computational protein structure prediction tries to overcome these problems by predicting the structure of a new protein using existing protein structures as a resource. Here we present TopSuite, a web server for protein model quality assessment (TopScore) and template-based protein structure prediction (TopModel). TopScore provides meta-predictions for global and residue-wise model quality estimation using deep neural networks. TopModel predicts protein structures using a top-down consensus approach to aid template selection and subsequently uses TopScore to refine and assess the predicted structures. The TopSuite web server is freely available at https://cpclab.uni-duesseldorf.de/topsuite/.
Protein domains are independent, functional, and stable structural units of proteins. Accurate protein domain boundary prediction plays an important role in understanding protein structure and evolution, as well as in protein structure prediction. Current domain boundary prediction methods differ in terms of boundary definition, methodology, and training databases, resulting in disparate performance for different proteins. We developed TopDomain, an exhaustive metapredictor that uses deep neural networks to combine multisource information from sequence- and homology-based features of over 50 primary predictors. For this purpose, we developed a new domain boundary data set termed the TopDomain data set, in which the true annotations are informed by SCOPe annotations, structural domain parsers, human inspection, and deep learning. We benchmark TopDomain against 2484 targets with 3354 boundaries from the TopDomain test set and achieve F1 scores of 78.4% and 73.8% for multidomain boundary prediction within ±20 residues and ±10 residues of the true boundary, respectively. When examined on targets from the CASP11-13 competitions, TopDomain achieves F1 scores of 47.5% and 42.8% for multidomain proteins. TopDomain significantly outperforms 15 widely used, state-of-the-art ab initio and homology-based domain boundary predictors. Finally, we implemented TopDomainTMC, which accurately predicts whether domain parsing is necessary for the target protein.
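The tolerance-based F1 score mentioned above can be illustrated with a minimal sketch. It assumes a simple greedy one-to-one matching, in which each predicted boundary may claim at most one true boundary within the tolerance window. The abstract does not state the exact matching protocol used for TopDomain, so the function name `boundary_f1` and the matching rule are illustrative assumptions, not the authors' evaluation code.

```python
def boundary_f1(true_boundaries, predicted_boundaries, tolerance=20):
    """Precision, recall, and F1 for domain boundary positions.

    Assumption (hypothetical protocol): a predicted boundary is a true
    positive if it lies within `tolerance` residues of a true boundary
    that has not already been matched by another prediction.
    """
    unmatched_true = sorted(true_boundaries)
    true_positives = 0

    for pred in sorted(predicted_boundaries):
        # True boundaries still available within the tolerance window.
        candidates = [t for t in unmatched_true if abs(t - pred) <= tolerance]
        if candidates:
            # Greedily claim the nearest one so it cannot be matched twice.
            nearest = min(candidates, key=lambda t: abs(t - pred))
            unmatched_true.remove(nearest)
            true_positives += 1

    false_positives = len(predicted_boundaries) - true_positives
    false_negatives = len(true_boundaries) - true_positives

    predicted_total = true_positives + false_positives
    true_total = true_positives + false_negatives
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / true_total if true_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up residue positions, for illustration only.
precision, recall, f1 = boundary_f1([100, 250], [110, 240, 300], tolerance=20)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Tightening `tolerance` from 20 to 10 residues can only keep or reduce the number of matches, which is consistent with the lower score reported for the ±10 window.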
Transmembrane proteins (TMPs) are critical components of cellular life. However, due to experimental challenges, experimentally resolved TMP structures are severely underrepresented in databases compared to their cellular abundance. Prediction of (per-residue) features such as transmembrane topology, membrane exposure, secondary structure, and solvent accessibility can be a useful starting point for experimental design or protein structure prediction but often requires different computational tools for different features or types of proteins. We present TopProperty, a metapredictor that predicts all of these features for TMPs or globular proteins. TopProperty is trained on datasets without bias toward a high number of sequence homologs, and the predictions are significantly better than the evaluated state-of-the-art primary predictors on all quality metrics. TopProperty eliminates the need for protein type- or feature-tailored tools, specifically for TMPs. TopProperty is freely available as a web server and standalone at https://cpclab.uni-duesseldorf.de/topsuite/.
Background / Rationale: The phosphatidylcholine floppase MDR3 is an essential hepatobiliary transport protein. MDR3 dysfunction is associated with various liver diseases, ranging from severe progressive familial intrahepatic cholestasis to transient forms of intrahepatic cholestasis of pregnancy and familial gallstone disease. Single amino acid substitutions are often found to cause dysfunction, but identifying the substitution effect in in vitro studies is time- and cost-intensive. Main results: We developed Vasor (Variant assessor of MDR3), a machine learning-based model to classify novel MDR3 missense variants into the categories benign or pathogenic. Vasor was trained on the largest MDR3-specific dataset of benign and pathogenic variants to date and uses as input general predictors, namely EVE, EVmutation, PolyPhen-2, I-Mutant2.0, MUpro, MAESTRO, and PON-P2, together with other variant properties such as half-sphere exposure, PTM site, and secondary structure disruption. Vasor consistently outperformed the integrated general predictors and the external prediction tool MutPred2, leading to the current best prediction performance for MDR3 single-site missense variants (on an external test set: F1-score: 0.90, MCC: 0.80). Furthermore, Vasor predictions cover the entire sequence space of MDR3. Vasor is accessible as a webserver at https://cpclab.uni-duesseldorf.de/mdr3_predictor/ for users to rapidly obtain prediction results and a visualization of the substitution site within the MDR3 structure. Conclusion: The MDR3-specific prediction tool Vasor can provide reliable predictions of single-site amino acid substitutions, giving users a fast initial assessment of whether a variant is benign or pathogenic.
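For readers unfamiliar with the two metrics quoted above, the following sketch shows how an F1-score and a Matthews correlation coefficient (MCC) are computed from the four cells of a binary confusion matrix, here with "pathogenic" taken as the positive class. The function name `f1_and_mcc` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the Vasor test set.

```python
import math


def f1_and_mcc(tp, fp, tn, fn):
    """Return (F1, MCC) for a binary classifier from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. Undefined cases (zero denominators) return 0.0.
    """
    # F1 is the harmonic mean of precision and recall,
    # which simplifies to 2TP / (2TP + FP + FN).
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # MCC uses all four cells, so it also reflects performance
    # on the negative (benign) class.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return f1, mcc


# Hypothetical counts, not taken from the paper.
f1, mcc = f1_and_mcc(tp=8, fp=2, tn=7, fn=3)
print(f"F1={f1:.3f} MCC={mcc:.3f}")
```

Reporting both values is useful because F1 ignores true negatives, whereas MCC ranges from -1 to 1 and stays near 0 for a classifier that is no better than chance on either class.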