Many computational approaches exist for predicting the effects of amino acid substitutions. Here, we considered whether the protein sequence position class – rheostat or toggle – affects these predictions. The classes are defined as follows: experimentally evaluated effects of amino acid substitutions at toggle positions are binary, while rheostat positions show progressive changes. For substitutions in the LacI protein, all evaluated methods failed two key expectations: toggle neutrals were incorrectly predicted as more non-neutral than rheostat non-neutrals, while toggle and rheostat neutrals were incorrectly predicted to be different. However, toggle non-neutrals were distinct from rheostat neutrals. Since many toggle positions are conserved, and most rheostats are not, predictors appear to annotate position conservation better than mutational effect. This finding can explain the well-known observation that predictors assign disproportionate weight to conservation, as well as the field’s inability to improve predictor performance. Thus, building reliable predictors requires distinguishing between rheostat and toggle positions.
Evaluating the impact of non-synonymous genetic variants is essential for uncovering disease associations and mechanisms of evolution. An in-depth understanding of sequence changes is also fundamental for synthetic protein design and stability assessments. However, the variant effect predictor performance gain observed in recent years has not kept up with the increased complexity of new methods. One likely reason for this might be that most approaches use similar sets of gene and protein features for modeling variant effects, often emphasizing sequence conservation. While high levels of conservation highlight residues essential for protein activity, much of the variation observable in vivo is arguably weaker in its impact, thus requiring evaluation at a higher level of resolution. Here, we describe function Neutral/Toggle/Rheostat predictor (funtrp), a novel computational method that categorizes protein positions based on the position-specific expected range of mutational impacts: Neutral (weak/no effects), Rheostat (function-tuning positions), or Toggle (on/off switches). We show that position types do not correlate strongly with familiar protein features such as conservation or protein disorder. We also find that position type distribution varies across different protein functions. Finally, we demonstrate that position types can improve performance of existing variant effect predictors and suggest a way forward for the development of new ones.
The internal trematodes from fish in the late Dr. Stafford's slide collection have been restudied and described. Gasterostomum armatum of Stafford is identified with Prosorhynchus squamatus Odhner, 1905, and Crepidostomum laureatum with C. cooperi Hopkins, 1932. Homalometron pallidum Stafford, 1904, is redescribed and it is suggested that the genus Homalometron Stafford, 1904, is synonymous with Lepocreadium Stossich, 1904. Neophasis pusilla Stafford, 1904, and Stenakron vetustum Stafford, 1904, are redescribed and assigned to the family Allocreadiidae. Hemiurus appendiculatus of Stafford is demonstrated to represent H. levinseni Odhner, 1905, and Brachyphallus crenatus Rudolphi, 1802, while Fellodistomum incisum of Stafford represents the species F. fellis Olsson, 1868, and F. agnotum Nicoll, 1909. Species of the genera Azygia and Otodistomum are reidentified. Protenteron diaphanum Stafford, 1904, is redescribed and the species is referred to the genus Cryptogonimus Osborn, 1903, the new combination being Cryptogonimus diaphanus.
The vast majority of microorganisms on Earth reside in often-inseparable environment-specific communities (microbiomes). Meta-genomic/-transcriptomic sequencing could reveal the otherwise inaccessible functionality of microbiomes. However, existing analytical approaches focus on attributing sequencing reads to known genes/genomes, often failing to make maximal use of available data. We created faser (functional annotation of sequencing reads), an algorithm that is optimized to map reads to molecular functions encoded by the read-correspondent genes. The mi-faser microbiome analysis pipeline, combining faser with our manually curated reference database of protein functions, accurately annotates microbiome molecular functionality. mi-faser’s minutes-per-microbiome processing speed is significantly faster than that of other methods, allowing for large-scale comparisons. Microbiome function vectors can be compared between different conditions to highlight environment-specific and/or time-dependent changes in functionality. Here, we identified previously unseen oil degradation-specific functions in BP oil-spill data, as well as functional signatures of individual-specific gut microbiome responses to a dietary intervention in children with Prader–Willi syndrome. Our method also revealed variability in Crohn's disease (CD) patient microbiomes and clearly distinguished them from those of related healthy individuals. Our analysis highlighted the role of the microbiome in CD pathogenicity, demonstrating enrichment of patient microbiomes in functions that promote inflammation and that help bacteria survive it.
Motivation: Evaluating the impact of non-synonymous genetic variants is essential for uncovering disease associations. Understanding the corresponding changes in protein sequences can also help with synthetic protein design and stability assessments. Even though hundreds of computational approaches addressing this task exist, and more are being developed, there has been little improvement in their performance in recent years. One of the likely reasons for this lack of progress might be that most approaches use similar sets of gene/protein features for model development, with great emphasis being placed on sequence conservation. While high levels of conservation clearly highlight residues essential for protein activity, much of the in vivo observable variation is arguably weaker in its impact and, thus, requires evaluation at a higher level of resolution. Results: Here we describe function Neutral/Toggle/Rheostat predictor (funtrp), a novel computational method that classifies protein positions by type based on the expected range of mutational impacts at that position: Neutral (most mutations have no or weak effects), Rheostat (range of effects; i.e. functional tuning), or Toggle (mostly strong effects). Three conclusions of our work are most salient. We show that our position types do not correlate strongly with the familiar protein features such as conservation or protein disorder. Moreover, we find that position type distribution varies across different enzyme classes. Finally, we demonstrate that position types reflect experimentally derived functional effects, improving performance of existing variant effect predictors and suggesting a way forward for the development of new ones. Availability: https://services.bromberglab.org/funtrp; Git: https://bitbucket.org/bromberglab/funtrp/ Contact: mmiller@bromberglab.org Supplementary information: Supplementary data are available online.
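The three position types defined above (Neutral, Rheostat, Toggle) can be illustrated with a minimal sketch. This is not the funtrp method, which is a trained predictor whose details are not given in the abstract. The sketch only assumes that each substitution at a position has a measured effect score in [0, 1], where 0 means wild-type-like and 1 means complete loss of function. The thresholds `WEAK_MAX`, `STRONG_MIN`, and `MAJORITY`, and the function names, are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical cutoffs for illustration only; they are NOT taken from funtrp.
WEAK_MAX = 0.2     # effect at or below this counts as no/weak effect
STRONG_MIN = 0.8   # effect at or above this counts as a strong effect
MAJORITY = 0.5     # fraction that stands in for "most" substitutions


def bin_effect(effect):
    """Bin one substitution effect (0 = wild-type-like, 1 = loss of function)."""
    if effect <= WEAK_MAX:
        return "weak"
    if effect >= STRONG_MIN:
        return "strong"
    return "intermediate"


def classify_position(effects):
    """Label a position from the effects of the substitutions measured at it.

    Neutral:  most substitutions have no or weak effects.
    Toggle:   most substitutions have strong effects.
    Rheostat: anything else, i.e. effects spread across a range.
    """
    if not effects:
        raise ValueError("at least one substitution effect is required")
    counts = Counter(bin_effect(e) for e in effects)
    total = len(effects)
    if counts["weak"] / total > MAJORITY:
        return "Neutral"
    if counts["strong"] / total > MAJORITY:
        return "Toggle"
    return "Rheostat"
```

For example, under these assumed cutoffs a position whose substitutions score `[0.1, 0.4, 0.6, 0.9]` would be labeled Rheostat, because neither weak nor strong effects form a majority.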