Background: There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increasing numbers of longer reads (of varying lengths in the range of 10k-1M bps, say), but of low quality. There is an increasing interest in using long reads in microbiome sequencing, and there is a need to adapt short read tools to long read datasets. Methods: We describe a new LCA-based algorithm for taxonomic binning, and an interval-tree based algorithm for functional binning, that are explicitly designed for long reads and assembled contigs. We provide a new interactive tool for investigating the alignment of long reads against reference sequences. For taxonomic and functional binning, we propose to use LAST to compare long reads against the NCBI-nr protein reference database so as to obtain frame-shift aware alignments, and then to process the results using our new methods. Results: All presented methods are implemented in the open source edition of MEGAN, and we refer to this new extension as MEGAN-LR (MEGAN long read). We evaluate the LAST+MEGAN-LR approach in a simulation study, and on a number of mock community datasets consisting of Nanopore reads, PacBio reads and assembled PacBio reads. We also illustrate the practical application on a Nanopore dataset that we sequenced from an anammox bioreactor community. Reviewers: This article was reviewed by Nicola Segata together with Moreno Zolfo, Pete James Lockhart and Serghei Mangul. Conclusion: This work extends the applicability of the widely-used metagenomic analysis software MEGAN to long reads. Our study suggests that the presented LAST+MEGAN-LR pipeline is sufficiently fast and accurate.
Background: Short-read sequencing technologies have long been the work-horse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible. Results: We demonstrate that whole bacterial chromosomes can be obtained from an enriched community, by application of MinION sequencing to a sample from an EBPR bioreactor, producing 6 Gb of sequence that assembles into multiple closed bacterial chromosomes. We provide a simple pipeline for processing such data, which includes a new approach to correcting erroneous frame-shifts. Conclusions: Advances in long-read sequencing technology and corresponding algorithms will allow the routine extraction of whole chromosomes from environmental samples, providing a more detailed picture of individual members of a microbiome. Electronic supplementary material: The online version of this article (10.1186/s40168-019-0665-y) contains supplementary material, which is available to authorized users.
The development of reliable, mixed-culture biotechnological processes hinges on understanding how microbial ecosystems respond to disturbances. Here we reveal extensive phenotypic plasticity and niche complementarity in oleaginous microbial populations from a biological wastewater treatment plant. We perform meta-omics analyses (metagenomics, metatranscriptomics, metaproteomics and metabolomics) on in situ samples over 14 months at weekly intervals. Based on 1,364 de novo metagenome-assembled genomes, we uncover four distinct fundamental niche types. Throughout the time-series, we observe a major, transient shift in community structure, coinciding with substrate availability changes. Functional omics data reveal extensive variation in gene expression and substrate usage amongst community members. Ex situ bioreactor experiments confirm that responses occur within five hours of a pulse disturbance, demonstrating rapid adaptation by specific populations. Our results show that community resistance and resilience are a function of phenotypic plasticity and niche complementarity, and set the foundation for future ecological engineering efforts.
Trichomonas vaginalis viruses (TVV), which may regulate P270 gene expression in the protozoan pathogen T. vaginalis, are a group of divergent double-stranded (ds) RNA viruses. In the present study, the complete 4674-bp cDNA sequence of a 4.6-kb ds RNA from a newly identified TVV2-1 isolate was determined. The sequence of the plus-strand mRNA contains four open reading frames, which encode overlapping cap and pol genes in the reading frame 2 and reading frame 1, respectively, and two putative serine-threonine-rich basic proteins VP3 and VP4 in the third reading frame. An 85-kDa capsid protein and a 160-kDa CAP-POL fusion protein were identified in crude viruses by Western blotting experiments using antisera raised against gene-specific oligopeptides. In conjunction with the presence of a potential ribosomal slippery heptanucleotide G GGC CCC within the overlap of the cap and pol genes, these observations suggest that the pol gene of TVV2-1 is translated via a -1 ribosomal frameshifting event during translation of the cap gene. Our results also provide insight into the conservation among divergent dsRNA species from TVV and suggest that the genome of TVV2-1 may encode two extra genes in addition to the cap and pol genes.
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes from complex microbial communities. Using long read data (ONT MinION) obtained from an ensemble of activated sludge enrichment bioreactors we recover 22 closed or complete genomes of community members, including several species known to play key functional roles in wastewater bioprocesses, specifically microbes known to exhibit the polyphosphate- and glycogen-accumulating organism phenotypes (namely Candidatus Accumulibacter and Dechloromonas, and Micropruina, Defluviicoccus and Candidatus Contendobacter, respectively), and filamentous bacteria (Thiothrix) associated with the formation and stability of activated sludge flocs. Additionally we demonstrate the recovery of close to 100 circularised plasmids, phages and small microbial genomes from these microbial communities using long read assembled sequence. We describe methods for validating long read assembled genomes using their counterpart short read metagenome-assembled genomes, and assess the influence of different correction procedures on genome quality and predicted gene quality. Our findings establish the feasibility of performing long read metagenome-assembled genome recovery for both chromosomal and non-chromosomal replicons, and demonstrate the value of parallel sampling of moderately complex enrichment communities to obtaining high quality reference genomes of key functional species relevant for wastewater bioprocesses.
Background: Short-read sequencing technologies have long been the work-horse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible. Results: We demonstrate that whole bacterial chromosomes can be obtained from a complex community, by application of MinION sequencing to a sample from an EBPR bioreactor, producing 6 Gb of sequence that assembles into multiple closed bacterial chromosomes. We provide a simple pipeline for processing such data, which includes a new approach to correcting erroneous frame-shifts. Conclusions: Advances in long-read sequencing technology and corresponding algorithms will allow the routine extraction of whole chromosomes from environmental samples, providing a more detailed picture of individual members of a microbiome.
Background: There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increasing numbers of longer reads (of varying lengths in the range of 10k-1M bps, say), but of low quality. There is an increasing interest in using long reads in microbiome sequencing and there is a need to adapt short read tools to long read datasets. Methods: We describe a new LCA-based algorithm for taxonomic binning, and an interval-tree based algorithm for functional binning, that are explicitly designed for long reads and assembled contigs. We provide a new interactive tool for investigating the alignment of long reads against reference sequences. For taxonomic and functional binning, we propose to use LAST to compare long reads against the NCBI-nr protein reference database so as to obtain frame-shift aware alignments, and then to process the results using our new methods. Results: All presented methods are implemented in the open source edition of MEGAN and we refer to this new extension as MEGAN-LR (MEGAN long read). We evaluate the LAST+MEGAN-LR approach in a simulation study, and on a number of mock community datasets consisting of Nanopore reads, PacBio reads and assembled PacBio reads. We also illustrate the practical application on a Nanopore dataset that we sequenced from an anammox bioreactor community. Background: There are numerous computational tools for taxonomic or functional binning or profiling of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads [1,2,3,4]. Alignment-based taxonomic binning is often performed using the naïve LCA algorithm [5], because it is fast, easy to interpret and easy to implement.
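To make the naïve LCA idea concrete, the following is a minimal sketch, not MEGAN's implementation and not the long-read variant the article proposes. It assumes each read has already been aligned and that the taxa of its reference hits are known; the read is then placed on the lowest taxonomic node that is an ancestor of all of those taxa. The `parent` map, the toy taxon names, and the function names `path_to_root` and `naive_lca` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def path_to_root(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the list of nodes from `taxon` up to the root."""
    path = [taxon]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    return path


def naive_lca(taxa: Iterable[str],
              parent: Dict[str, Optional[str]]) -> Optional[str]:
    """Lowest common ancestor of all taxa hit by one read.

    Returns None if no taxa are given or if they share no ancestor.
    """
    # Reverse each path so that it runs from the root down to the taxon.
    paths = [path_to_root(t, parent)[::-1] for t in taxa]
    if not paths:
        return None
    lca = None
    # Walk down level by level while all paths agree on the node.
    for nodes in zip(*paths):
        if all(n == nodes[0] for n in nodes):
            lca = nodes[0]
        else:
            break
    return lca


if __name__ == "__main__":
    print(naive_lca(["Escherichia", "Salmonella"], PARENT))
    print(naive_lca(["Escherichia", "Bacillus"], PARENT))
```

In this toy setting, a read whose hits fall in two genera of the same phylum is assigned to that phylum, while a read with hits in different phyla moves up to the domain. This illustrates why the approach is conservative and easy to interpret: the more taxonomically spread the hits, the higher the read is placed.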
Functional binning usually involves a best-hit strategy to assign reads to functional classes. Software or websites for analyzing microbiome shotgun sequencing samples usually provide some level of interactivity, such as MG-RAST [2]. The interactive microbiome analysis tool MEGAN, which was first used in 2006 [6], is one of the most feature-rich tools of this type. MEGAN is highly optimized to enable users to interactively explore large numbers of microbiome samples containing hundreds of millions of short reads. Illumina HiSeq and MiSeq sequencers allow researchers to generate sequencing data on a huge scale, so as to analyze many samples at a great sequencing depth [7,8,9]. A wide range of questions, in particular involving the presence or absence of particular organisms or genes in a sample, can be answered using such data. However, there are interesting problems that are not easily resolved using short reads. For example, the question of whether two genes that are both detected in the same microbiome sample also occur together on the same genome can often not be ...
A type III Trichomonas vaginalis virus, which may be involved in transcriptional regulation of the major surface protein gene P270 of the protozoan pathogen Trichomonas vaginalis, was purified and characterized in the present study. The complete 4844-base-pair complementary DNA sequence of the viral genome reveals overlapping cap and pol genes with a putative ribosomal frame-shifting signal within the overlap region. The type III virus is related more closely to the type II virus than to the type I virus in the sequence of its ribosomal frameshift signal and in its capsid protein. Phylogenetic analysis revealed that these viruses could be grouped in the same clade as a genus distantly related to other genera in the family Totiviridae. Virus-induced P270 gene expression was only evident in Trichomonas vaginalis cells infected with either a type II or type III virus, but not with a type I virus. These findings suggest that transcription of the P270 gene is likely regulated by viral factors common to type II and type III viruses and thus provides important information for future investigation of virus-host interactions.