Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in terms of the volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Here we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in terms of the volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility (P. aeruginosa only). We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. We conclude that, while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
Predicting GO terms for a protein (protein-centric) and predicting which proteins are associated with a given function (term-centric) are related but different computational problems: the former is a multi-label classification problem with a structured output, while the latter is a binary classification task. Predicting the results of a genome-wide screen for a single or a small number of functions fits the term-centric formulation. To see how well all participating CAFA methods perform term-centric predictions, we mapped results from the protein-centric CAFA3 methods onto these terms. In addition, we held a separate CAFA challenge, CAFA-π, whose purpose was to attract additional submissions from algorithms that specialize in term-centric tasks. We performed screens for three functions in three species, which we then used to assess protein function prediction. In the bacterium Pseudomonas aeruginosa and the fungus Candida albicans we performed genome-wide screens capable of uncovering genes with two functions, biofilm formation (GO:0042710) and motility (for P. aeruginosa only) (GO:0001539), as described in Methods. In Drosophila melanogaster we performed targeted assays, guided by previous CAFA submissions, of a ...
Three Gram-stain-positive, non-spore-forming, microaerophilic and fructose-6-phosphate phosphoketolase positive strains were isolated from a faecal sample of an adult subject of the emperor tamarin (Saguinus imperator). Given that the isolates revealed identical BOX PCR profiles, strain TRI 5 was selected as a representative and characterized further. Comparative analysis of 16S rRNA gene sequence similarity revealed that strain TRI 5 was closely related to Bifidobacterium saguini DSM 23967 (96.4 %) and to Bifidobacterium longum subsp. longum ATCC 15708 (96.2 %). Multilocus sequence analyses of five housekeeping genes showed the close phylogenetic relatedness of this strain to Bifidobacterium breve DSM 20213 (hsp60 94.1 %), Bifidobacterium saguini DSM 23967 (clpC 91 %), Bifidobacterium avesanii DSM 100685 (dnaG 80.3 %), Bifidobacterium longum subsp. infantis ATCC 15697 (dnaJ 85.3 %) and Bifidobacterium longum subsp. longum ATCC 15708 (rpoB 93 %), respectively. The peptidoglycan type was A3β, with an interpeptide bridge comprising l-Orn (Lys) - l-Ser - l-Ala - l-Thr - l-Ala. The DNA G+C content of strain TRI 5 was 60.9 mol%. Based on the data provided, strain TRI 5 represents a novel species of the genus Bifidobacterium for which the name Bifidobacterium callitrichidarum sp. nov. is proposed. The type strain is TRI 5 (=DSM 103152=JCM 31790).
The conformational landscape of a protein is constantly expanded by genetic variations that have a minimal impact on the function(s) while causing subtle effects on protein structure. The wider the conformational space sampled by these variants, the higher the probability of adapting to changes in environmental conditions. However, the probability that a single mutation may result in a pathogenic phenotype also increases. Here we present a paradigmatic example of how protein evolution balances structural stability and dynamics to maximize protein adaptability and preserve protein fitness. We took advantage of known genetic variations of human alanine:glyoxylate aminotransferase (AGT1), which is present as a common major allelic form (AGT‐Ma) and a minor polymorphic form (AGT‐Mi) expressed in 20% of the Caucasian population. By integrating crystallographic studies and molecular dynamics simulations, we show that AGT‐Ma is endowed with structurally unstable (frustrated) regions, which become disordered in AGT‐Mi. An in‐depth biochemical characterization of variants from an anticonsensus library, encompassing the frustrated regions, correlates this plasticity to a fitness window defined by AGT‐Ma and AGT‐Mi. Finally, co‐immunoprecipitation analysis suggests that structural frustration in AGT1 could favor additional functions related to protein–protein interactions. These results expand our understanding of protein structural evolution by establishing that naturally occurring genetic variations tip the balance between stability and frustration to maximize the ensemble of conformations falling within a well‐defined fitness window, thus expanding the adaptability potential of the protein.
Nucleobase-containing coenzymes are considered the relics of an early RNA-based world that preceded the emergence of protein domains. Despite the importance of coenzyme-protein synergisms, their emergence and evolution remain poorly understood. An excellent target to address this issue is the Rossmann fold, the most catalytically diverse and abundant protein architecture in Nature. Here, we investigated the two largest Rossmann lineages, namely the nicotinamide adenine dinucleotide phosphate (NAD(P))-binding and the S-adenosyl methionine (SAM)-dependent superfamilies. With the aim of identifying the evolutionary changes that lead to a switch in coenzyme specificity in these superfamilies, we used structural and sequence-based Hidden Markov Models to systematically search for key motifs in their coenzyme-binding pockets. Our analyses revealed how insertions and deletions (InDels) reshaped the ancient β1−loop−α1 coenzyme-binding structure of NAD(P) into the well-defined SAM-binding β1−loop−α1 structure. To test this observation experimentally, we removed an InDel of three amino acids from the NAD(P) coenzyme pocket and solved the structure of the resulting mutant, revealing the characteristic features of the SAM-binding pocket. To confirm the binding to SAM, we performed isothermal titration calorimetry measurements, validating the successful coenzyme switch. Molecular dynamics simulations also corroborated the role of InDels in abolishing NAD binding and acquiring SAM binding. Our results uncovered how Nature utilized insertions and deletions to switch coenzyme specificity and, in turn, functionalities between these superfamilies. This work also establishes how protein structures could have been recycled through the course of evolution to adopt different coenzymes and confer different chemistries.
Significance Statement: Cofactors are ubiquitous molecules necessary to drive about half of the enzymatic reactions in Nature. Among them, organic cofactors (coenzymes) that contain nucleotide moieties are believed to be relics of a hypothetical RNA world. Understanding coenzyme-binding transitions sheds light on the emergence of the first enzymes and their chemical diversity. Rossmann enzymes bind to 7 out of 10 nucleotide coenzymes, representing an ideal target to study how different coenzyme specificities emerged and evolved. Here we demonstrated how insertions and deletions reshape coenzyme specificity in Rossmann enzymes by retracing the emergence of the SAM-binding function from an NAD-binding ancestor. This work constitutes the first example of an evolutionary bridge between redox and methylation reactions, providing a new strategy to engineer coenzyme specificity.