New methods to characterize microbiomes reduce technology-imposed limitations to study design, but many new approaches have not been widely adopted. Here, we present techniques to increase throughput and reduce contamination alongside a thorough review of current best practices.
Advances in DNA sequencing have made it feasible to gather genomic data for non‐model organisms and large sets of individuals, often using methods for sequencing subsets of the genome. Several of these methods sequence DNA associated with endonuclease restriction sites (various RAD and GBS methods). For use in taxa without a reference genome, these methods rely on de novo assembly of fragments in the sequencing library. Many of the software options available for this application were originally developed for other assembly types and we do not know their accuracy for reduced representation libraries. To address this important knowledge gap, we simulated data from the Arabidopsis thaliana and Homo sapiens genomes and compared de novo assemblies by six software programs that are commonly used or promising for this purpose (ABySS, CD‐HIT, Stacks, Stacks2, Velvet and VSEARCH). We simulated different mutation rates and types of mutations, and then applied the six assemblers to the simulated data sets, varying assembly parameters. We found substantial variation in software performance across simulations and parameter settings. ABySS failed to recover any true genome fragments, and Velvet and VSEARCH performed poorly for most simulations. Stacks and Stacks2 produced accurate assemblies of simulations containing SNPs, but the addition of insertion and deletion mutations decreased their performance. CD‐HIT was the only assembler that consistently recovered a high proportion of true genome fragments. Here, we demonstrate the substantial difference in the accuracy of assemblies from different software programs and the importance of comparing assemblies that result from different parameter settings.
In aquatic systems, microbes likely play critical roles in biogeochemical cycling and ecosystem processes, but much remains to be learned regarding microbial biogeography and ecology. The microbial ecology of mountain lakes is particularly understudied. We hypothesized that microbial distribution among lakes is shaped, in part, by aquatic plant communities and the biogeochemistry of the lake. Specifically, we investigated the associations of yellow water lilies (Nuphar polysepala) with the biogeochemistry and microbial assemblages within mountain lakes at two scales: within a single lake and among lakes within a mountain range. We first compared the biogeochemistry of lakes without water lilies to those colonized to varying degrees by water lilies. Lakes with >10% of the surface occupied by water lilies had lower pH and higher dissolved organic carbon than those without water lilies and had a different microbial composition. Notably, cyanobacteria were negatively associated with water lily presence, a result consistent with the past observation that macrophytes outcompete phytoplankton and can suppress cyanobacterial and algal blooms. To examine the influence of macrophytes on microbial distribution within a lake, we characterized microbial assemblages present on abaxial and adaxial water lily leaf surfaces and in the water column. Microbial diversity and composition varied among all three habitats, with the highest diversity of microbes observed on the adaxial side of leaves. Overall, this study suggests that water lilies influence the biogeochemistry and microbiology of mountain lakes.