Citation: Hale, H., E. M. Gardner, J. Viruel, L. Pokorny, and M. G. Johnson. 2020. Strategies for reducing per-sample costs in target capture sequencing for phylogenomics and population genomics in plants. Applications in Plant Sciences 8(4): e11337.
The reduced cost of high-throughput sequencing and the development of gene sets with wide phylogenetic applicability have led to the rise of sequence capture methods as a plausible platform for both phylogenomics and population genomics in plants. An important consideration in large targeted sequencing projects is the per-sample cost, which can be inflated when using off-the-shelf kits or reagents not purchased in bulk. Here, we discuss methods to reduce per-sample costs in high-throughput targeted sequencing projects. We review the minimal equipment and consumable requirements for targeted sequencing while comparing several alternatives to reduce bulk costs in DNA extraction, library preparation, target enrichment, and sequencing. We consider how each of the workflow alterations may be affected by DNA quality (e.g., fresh vs. herbarium tissue), genome size, and the phylogenetic scale of the project. We provide a cost calculator for researchers considering targeted sequencing to use when designing projects, and identify challenges for future development of low-cost sequencing in non-model plant systems.
KEY WORDS: enzymatic fragmentation; herbariomics; high-throughput workflow implementation; Hyb-Seq; low-cost sequence capture; pooling and multiplexing strategies.
The apple cultivar ‘Honeycrisp’ has superior fruit quality traits, cold hardiness, and disease resistance, making it a popular breeding parent. However, it suffers from several physiological disorders and from production and postharvest issues. Despite several available apple genome sequences, understanding of the genetic mechanisms underlying cultivar-specific traits remains lacking. Here, we present a highly contiguous, fully phased, chromosome-level genome of ‘Honeycrisp’ apples, using PacBio HiFi, Omni-C, and Illumina sequencing platforms, with two assembled haplomes of 674 Mbp and 660 Mbp, and contig N50 values of 32.8 Mbp and 31.6 Mbp, respectively. Overall, 47,563 and 48,655 protein-coding genes were annotated from each haplome, capturing 96.8–97.4% complete BUSCOs in the eudicot database. Gene family analysis reveals most ‘Honeycrisp’ genes are assigned into orthogroups shared with other genomes, with 121 ‘Honeycrisp’-specific orthogroups. This resource is valuable for understanding the genetic basis of important traits in apples and related Rosaceae species to enhance breeding efforts.
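The contig N50 values quoted in this abstract follow a standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation is given here for orientation; the function name `n50` and the toy contig lengths are illustrative assumptions, not taken from the ‘Honeycrisp’ assembly or its pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Illustrative helper (name and interface are assumptions): contigs are
    sorted from longest to shortest and accumulated until the running sum
    reaches at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example with made-up lengths: total is 32, and the two longest
# contigs (8 + 8 = 16) already cover half of it.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

A higher N50 for the same total length indicates that more of the assembly sits in long contigs, which is why the statistic is reported alongside haplome size.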
The estimation of demographic parameters in natural populations is a critical tool for species delimitation (Duminil and Di Michele, 2009), biogeography studies (Overcast et al., 2019), and monitoring of populations and species in a dynamically changing environment (Allendorf et al., 2010). The feasibility of estimating demographic parameters (including heterozygosity, effective population size, and levels of introgression) in non-model taxa relies on retrieving homologous markers that allow detection of sufficient variation across the genome, while remaining cost-effective for the analysis of hundreds of individuals. In plants, population genomic studies could benefit from markers that enable the further unlocking of herbarium specimens for botanical research, paralleling the impact of herbarium specimens in phylogenomics (e.g., Shee et al., 2020), microbiome research (e.g., Heberling and Burke, 2019), and studies of the effects of climate change on plant populations (e.g., Miller-Rushing et al., 2009). Traditional Sanger sequencing of PCR amplicons often employs universal primer sequences, but the genes targeted (e.g., the plastid markers matK and rbcL, or the nuclear ribosomal ITS) do not
Targeted sequencing using Angiosperms353 has emerged as a low-cost tool for phylogenetics, with early results spanning scales from all flowering plants to within genera. The use of universal markers at narrower scales (within populations) would eliminate the need for specific marker development while retaining the benefits of full-gene sequences. However, it is unclear whether the Angiosperms353 markers provide sufficient variation within species to calculate demographic parameters. Using herbarium specimens from a 50-year-old floristic survey of Guadalupe Mountains National Park, we sequenced 95 samples from 24 species using Angiosperms353. We adapted a data workflow to process targeted sequencing data that calls variants within each species and prepares data for population genetic analysis. We calculated genetic diversity using standard metrics (e.g., heterozygosity, Tajima's D). Angiosperms353 gene recovery was associated with genomic library concentration, with limited phylogenetic bias. We identified over 1000 segregating variants with zero missing data within 22 of 24 species. A subset of these variants, which were filtered to remove linked SNPs, revealed high heterozygosity in many species. Tajima's D calculated within each species indicated a moderate number of markers potentially under selection and identified evidence of population bottlenecks in some species. Despite sequencing few individuals per species, the Angiosperms353 markers contained sufficient variation to calculate demographic parameters. Larger sampling within species will allow for estimating gene flow and population dynamics in any angiosperm. Our study will benefit conservation genetics, where Angiosperms353 provides universal repeatable markers, low missing data, and haplotype information.
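As an illustration of one of the metrics named above, Tajima's D contrasts two estimates of diversity: the mean number of pairwise differences (pi) and the number of segregating sites (S) scaled by a harmonic-number term. The sketch below follows the standard formula from Tajima (1989). It is not the workflow used in the study: the function names are hypothetical, and the input is assumed to be a list of aligned, equal-length haplotype strings with no missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d_from_summary(n, num_segregating, pi):
    """Tajima's D from sample size n, segregating sites S, and pi.

    Hypothetical helper. Requires n >= 4 because the variance term is zero
    for smaller samples. Returns NaN when there are no segregating sites.
    """
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if num_segregating == 0:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    s = num_segregating
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def tajimas_d(haplotypes):
    """Tajima's D for aligned haplotype strings (assumed input format)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to equal length")
    # A site is segregating if more than one allele is observed in its column.
    s = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    # pi: mean number of differences over all pairs of haplotypes.
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(x != y for x, y in zip(h1, h2)) for h1, h2 in pairs)
    pi = total_diffs / len(pairs)
    return tajimas_d_from_summary(n, s, pi)


# Toy data (made up): every variant is a singleton, so D is negative.
toy = ["AAAA", "AAAT", "AATA", "ATAA", "TAAA"]
print(tajimas_d(toy))
```

Negative values reflect an excess of rare variants and positive values an excess of intermediate-frequency variants. Real data from diploid genotype calls would first need to be phased or converted to allele counts, which this sketch does not attempt.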