This study identifies and analyzes statistically significant overlaps between selective sweep screens in anatomically modern humans and several domesticated species. The results obtained suggest that (paleo-)genomic data can be exploited to complement the fossil record and support the idea of self-domestication in Homo sapiens, a process that likely intensified as our species populated its niche. Our analysis lends support to attempts to capture the “domestication syndrome” in terms of alterations to certain signaling pathways and cell lineages, such as the neural crest.
The efficiency with which microorganisms degrade lignified plants is of great importance in the Earth’s carbon cycle, but also in industrial biorefinery processes, such as biofuel production. Here, we present a large-scale proteomics approach to investigate and compare the enzymatic response of five filamentous fungi when grown on five very different substrates: grass (sugarcane bagasse), hardwood (birch), softwood (spruce), cellulose and glucose. The five fungi included the ascomycetes Aspergillus terreus, Trichoderma reesei, Myceliophthora thermophila, Neurospora crassa and the white-rot basidiomycete Phanerochaete chrysosporium, all expressing a diverse repertoire of enzymes. In this study, we present comparable quantitative protein abundance values across five species and five diverse substrates. The results allow for direct comparison of fungal adaptation to the different substrates, give indications as to the substrate specificity of individual carbohydrate-active enzymes (CAZymes), and reveal proteins of unknown function that are co-expressed with CAZymes. Based on the results, we present a quantitative comparison of 34 lytic polysaccharide monooxygenases (LPMOs), which are crucial enzymes in biomass deconstruction.
While the field of microbiology has adapted to the study of complex microbiomes via modern meta-omics techniques, we have not updated our basic knowledge regarding the quantitative levels of DNA, RNA and protein molecules within a microbial cell, which ultimately control cellular function. Here we report the temporal measurements of absolute RNA and protein levels per gene within a mixed bacterial-archaeal consortium. Our analysis of these data reveals an absolute protein-to-RNA ratio of 10²–10⁴ for bacterial populations and 10³–10⁵ for an archaeon, the latter being more comparable to the eukaryotic representatives humans and yeast. Furthermore, we use the linearity between the metaproteome and metatranscriptome over time to identify core functional guilds, hence using a fundamental biological feature (i.e., RNA/protein levels) to highlight phenotypical complementarity. Our findings show that upgrading multi-omic toolkits with traditional absolute measurements unlocks the scaling of core biological questions to dynamic and complex microbiomes, creating a deeper insight into inter-organismal relationships that drive the greater community function.
Background: The rapid development of the (meta-)omics fields has produced an unprecedented amount of high-resolution and high-fidelity data. Through the use of these datasets we can infer the role of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation can be described as the identification of regions of interest (i.e., domains) in protein sequences and the assignment of biological functions. Despite the existence of numerous tools, challenges remain in terms of speed, flexibility, and reproducibility. In the big data era, it is also increasingly important to cease limiting our findings to a single reference, coalescing knowledge from different data sources, and thus overcoming some limitations in overly relying on computationally generated data from single sources. Results: We implemented a protein annotation tool, Mantis, which uses database identifiers intersection and text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible, allowing for the customization of reference data and execution parameters, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which significantly improved annotation performance compared to sequence-wide annotation. The parallelized implementation of Mantis results in short runtimes while also outputting high coverage and high-quality protein function annotations. Conclusions: Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis.
Microbial communities that degrade lignocellulosic biomass are typified by high levels of species- and strain-level complexity, as well as synergistic interactions between both cellulolytic and non-cellulolytic microorganisms. Coprothermobacter proteolyticus frequently dominates thermophilic, lignocellulose-degrading communities with wide geographical distribution, which is in contrast to reports that it ferments proteinaceous substrates and is incapable of polysaccharide hydrolysis. Here we deconvolute a highly efficient cellulose-degrading consortium (SEM1b) that is co-dominated by Clostridium (Ruminiclostridium) thermocellum and multiple heterogenic strains affiliated to C. proteolyticus. Metagenomic analysis of SEM1b recovered metagenome-assembled genomes (MAGs) for each constituent population, whereas in parallel two novel strains of C. proteolyticus were successfully isolated and sequenced. Annotation of all C. proteolyticus genotypes (two strains and one MAG) revealed their genetic acquisition of carbohydrate-active enzymes (CAZymes), presumably derived from horizontal gene transfer (HGT) events involving polysaccharide-degrading Firmicutes or Thermotogae-affiliated populations that are historically co-located. HGT material included a saccharolytic operon, from which a CAZyme was biochemically characterized and demonstrated hydrolysis of multiple hemicellulose polysaccharides. Finally, temporal genome-resolved metatranscriptomic analysis of SEM1b revealed expression of C. proteolyticus CAZymes at different SEM1b life stages as well as co-expression of CAZymes from multiple SEM1b populations, inferring deeper microbial interactions that are dedicated toward community degradation of cellulose and hemicellulose. We show that C. proteolyticus, a ubiquitous population, consists of closely related strains that have adapted via HGT to presumably degrade both oligo- and longer polysaccharides present in decaying plants and microbial cell walls, thus explaining its dominance in thermophilic anaerobic digesters on a global scale.
The genetic plasticity of Coprothermobacter
COMPETING INTERESTS: The authors declare there are no competing financial interests in relation to the work described.
ABSTRACT: Microbial communities that degrade lignocellulosic biomass are typified by high levels of species- and strain-level complexity as well as synergistic interactions between both cellulolytic and non-cellulolytic microorganisms. Coprothermobacter proteolyticus frequently dominates thermophilic, lignocellulose-degrading communities with wide geographical distribution, which is in contrast to reports that it ferments proteinaceous substrates and is incapable of polysaccharide hydrolysis. Here we deconvolute a highly efficient cellulose-degrading consortium (SEM1b) that is co-dominated by Clostridium (Ruminiclostridium) thermocellum and multiple heterogenic strains affiliated to C. proteolyticus. Metagenomic analysis of SEM1b recovered metagenome-assembled genomes (MAGs) for each constituent population, whilst in parallel two novel strains of C. proteolyticus were successfully isolated and sequenced. Annotation of all C. proteolyticus genotypes (two strains and one MAG) revealed their genetic acquisition of various carbohydrate-active enzymes (CAZymes), presumably derived from horizontal gene transfer (HGT) events involving C. thermocellum- or Thermotogae-affiliated populations that are historically co-located. HGT material included whole saccharolytic operons and dockerin-encoding enzymatic subunits that are synonymous with cellulosomes. Finally, temporal genome-resolved metatranscriptomic analysis of SEM1b revealed expression of C. proteolyticus CAZymes at different SEM1b life-stages as well as co-expression of CAZymes from multiple SEM1b populations, inferring deeper microbial interactions that are dedicated towards co-degradation of cellulose and hemicellulose. We show that C. proteolyticus, a ubiquitous keystone population, consists of closely related strains that have adapted via HGT to degrade both oligo- and longer polysaccharides present in decaying plants and microbial cell walls, thus explaining its dominance in thermophilic anaerobic digesters on a global scale.
... al., 2010). Strain-level genomic variations typically consist of single-nucleotide variants (SNVs) as well as acquisition/loss of genomic elements such as genes, operons or plasmids via horizontal gene transfer (HGT) (Koskella and Vos 2015, Tettelin et al., 2005, Treangen and Rocha 2011). Variability in gene content caused by HGT is typically attributed to phage-related genes and other genes of unknown function (Ochman et al., 2000), and can give rise to ecological adaptation, niche differentiation and eventually speciation (Bendall et al., 2016, Biller et al., 2015, Shapiro et al., 2012). Although differences in genomic features can be accurately characterized in isolated strains, it has been difficult to capture such information ...
Background: The past decades have seen a rapid development of the (meta-)omics fields, producing an unprecedented amount of data. Through the use of well-characterized datasets we can infer the role of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation allows the identification of regions of interest (i.e. domains) in protein sequences and the assignment of biological functions. Despite the existence of numerous tools, some challenges remain, specifically in terms of speed, flexibility, and reproducibility. In the era of big data it also becomes increasingly important to cease limiting our findings to a single reference, coalescing knowledge from different data sources, thus overcoming some limitations in overly relying on computationally generated data. Results: We implemented a protein annotation tool, Mantis, which uses text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible and adaptable, allowing for total customization of the reference data used, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which led to an average 0.038 increase in precision when compared to sequence-wide annotation. Mantis is fast, annotating an average genome in 25-40 minutes, whilst also outputting high-quality annotations (average coverage 81.4%, average precision 0.892). Conclusions: Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis
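The depth-first search for domain-specific annotation mentioned in the abstract above can be illustrated with a small sketch. This is not Mantis's code: the hit layout `(domain id, start, end, score)`, the function names, and the rule "prefer the non-overlapping combination with the highest summed score" are all assumptions made for illustration. The abstract does not state how combinations are scored.

```python
from typing import List, Tuple

# Hypothetical hit layout: (domain id, start, end, score), inclusive coordinates.
Hit = Tuple[str, int, int, float]


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if two hits share at least one residue."""
    return a[1] <= b[2] and b[1] <= a[2]


def best_combination(hits: List[Hit]) -> List[Hit]:
    """Depth-first search over all sets of mutually non-overlapping hits.

    Assumption: the best combination is the one with the highest summed score.
    """
    ordered = sorted(hits, key=lambda h: (h[1], h[2]))
    best: List[Hit] = []
    best_score = float("-inf")

    def dfs(index: int, chosen: List[Hit], score: float) -> None:
        nonlocal best, best_score
        if index == len(ordered):
            # Leaf of the search tree: one complete combination.
            if score > best_score:
                best, best_score = list(chosen), score
            return
        hit = ordered[index]
        # Branch 1: keep this hit if it is compatible with all chosen hits.
        if all(not overlaps(hit, c) for c in chosen):
            chosen.append(hit)
            dfs(index + 1, chosen, score + hit[3])
            chosen.pop()
        # Branch 2: skip this hit.
        dfs(index + 1, chosen, score)

    dfs(0, [], 0.0)
    return best


# Toy hits on one protein sequence (invented values, not from the paper).
example_hits = [
    ("A", 1, 100, 50.0),
    ("B", 90, 200, 60.0),
    ("C", 101, 180, 30.0),
    ("D", 201, 300, 40.0),
]
print([h[0] for h in best_combination(example_hits)])
```

In this toy case a sequence-wide approach would keep only the single best hit (B), whereas the search keeps three compatible domains. The exhaustive search is exponential in the number of hits, so a practical tool would need pruning or a fallback for proteins with many hits.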
Monitoring SARS-CoV-2 in wastewater has been shown to be an effective tool for epidemiological surveillance. More specifically, RNA levels determined with RT-qPCR have been shown to track with the infection dynamics within the population. However, the surveillance of individual lineages circulating in the population based on genomic sequencing of wastewater samples is challenging, as the genetic material constitutes a mixture of different viral haplotypes. Here, we identify specific signature mutations from individual SARS-CoV-2 lineages in wastewater samples to estimate lineages circulating in Luxembourg. We compare circulating lineages and mutations to those detected in clinical samples amongst infected individuals. We show that, especially for dominant lineages, the allele frequencies of signature mutations correspond to the occurrence of particular lineages in the population. In addition, we provide evidence that regional clusters can also be discerned. We focused on the time period between November 2020 and March 2021, in which several variants of concern emerged, and specifically traced the lineage B.1.1.7, which became dominant in Luxembourg during that time. During the subsequent time points, we were able to reconstruct short haplotypes, highlighting the co-occurrence of several signature mutations. Our results highlight the potential of genomic surveillance in wastewater samples based on amplicon short-read data. By extension, our work provides the basis for the early detection of novel SARS-CoV-2 variants.
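The idea of reading lineage occurrence from the allele frequencies of signature mutations can be sketched as follows. The abstract does not specify an estimator, so the median, the `min_detected` threshold, and the function name `lineage_signal` are assumptions made for illustration. The frequencies are invented toy values, not data from the study.

```python
from statistics import median
from typing import Dict, Iterable, Optional


def lineage_signal(
    allele_freqs: Dict[str, float],
    signature: Iterable[str],
    min_detected: int = 2,
) -> Optional[float]:
    """Summarize a lineage's presence in one wastewater sample.

    allele_freqs maps mutation name -> allele frequency (0..1) for that sample.
    Mutations missing from the mapping are treated as not covered by reads,
    not as frequency 0 (an assumption of this sketch).
    Returns the median frequency of the detected signature mutations,
    or None if fewer than min_detected of them were observed.
    """
    observed = [allele_freqs[m] for m in signature if m in allele_freqs]
    if len(observed) < min_detected:
        return None
    return median(observed)


# A few spike mutations characteristic of B.1.1.7, used as a toy signature.
b117_signature = ["S:N501Y", "S:A570D", "S:P681H", "S:T716I"]

# Toy allele frequencies for one sample (invented for illustration).
sample = {"S:N501Y": 0.42, "S:A570D": 0.38, "S:P681H": 0.45, "S:D614G": 0.99}

print(lineage_signal(sample, b117_signature))
```

Using the median rather than the mean makes the summary less sensitive to a single mutation shared with other lineages, and requiring several detected mutations reduces false calls from one noisy site.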