Expanded functionality, increased accuracy, and enhanced speed in the de novo genotyping-by-sequencing pipeline GBS-SNP-CROP

Melo, Arthur T. O.; Hale, Iago

doi:10.1093/bioinformatics/bty873

Cited by 11 publications (12 citation statements)

References 26 publications (23 reference statements)

“…Raw FASTQ files were generated by CASAVA 1.8.3 and analyzed using the reference-free bioinformatics pipeline GBS-SNP-CROP [38, 66]. A Mock Reference (MR) was constructed using the high quality PE reads from the two parents; and putative variants, both SNPs and indels, were identified via alignment of high quality PE reads from the parents and all F1 progeny to the MR, following the pipeline’s recommended parameters for diploid species.…”

Section: Methods (mentioning)

confidence: 99%

Mapping non-host resistance to the stem rust pathogen in an interspecific barberry hybrid

et al. 2019

Self Cite

Background: Non-host resistance (NHR) presents a compelling long-term plant protection strategy for global food security, yet the genetic basis of NHR remains poorly understood. For many diseases, including stem rust of wheat [causal organism Puccinia graminis (Pg)], NHR is largely unexplored due to the inherent challenge of developing a genetically tractable system within which the resistance segregates. The present study turns to the pathogen’s alternate host, barberry (Berberis spp.), to overcome this challenge. Results: In this study, an interspecific mapping population derived from a cross between Pg-resistant Berberis thunbergii (Bt) and Pg-susceptible B. vulgaris was developed to investigate the Pg-NHR exhibited by Bt. To facilitate QTL analysis and subsequent trait dissection, the first genetic linkage maps for the two parental species were constructed and a chromosome-scale reference genome for Bt was assembled (PacBio + Hi-C). QTL analysis resulted in the identification of a single 13 cM region (~5.1 Mbp spanning 13 physical contigs) on the short arm of Bt chromosome 3. Differential gene expression analysis, combined with sequence variation analysis between the two parental species, led to the prioritization of several candidate genes within the QTL region, some of which belong to gene families previously implicated in disease resistance. Conclusions: Foundational genetic and genomic resources developed for Berberis spp. enabled the identification and annotation of a QTL associated with Pg-NHR. Although subsequent validation and fine mapping studies are needed, this study demonstrates the feasibility of and lays the groundwork for dissecting Pg-NHR in the alternate host of one of agriculture’s most devastating pathogens. Electronic supplementary material: The online version of this article (10.1186/s12870-019-1893-9) contains supplementary material, which is available to authorized users.

“…Because a reference genome of caraway is not available, a reference was built using vsearch (v2.7.1_linux_x86_64) including a dereplication (default parameter) and a clustering (non-default parameter: cluster_fast, id 0.93, sizein True, sizeout True) [33]. In detail, the built reference can be called a 'mock reference', composed of consensus GBS fragments [34]. Reads were mapped against this reference using BWA-mem (v0.7.15-r1140) [35].…”

Section: SNP Discovery (mentioning)

confidence: 99%

On genetic diversity in caraway: Genotyping of a large germplasm collection

et al. 2020

Caraway (Carum carvi) is a widespread and frequently used spice and medicinal plant with a long history of cultivation. However, due to ongoing climatic changes, the cultivation is becoming increasingly risky. To secure caraway cultivation in future, timely breeding efforts to develop adapted material are necessary. Analysis of genetic diversity can accompany this process, for instance, by revealing untapped gene pools. Here, we analyzed 137 accessions using genotyping by sequencing (GBS). Hence, we can report a broad overview of population structure and genetic diversity of caraway. Population structure was determined using a principal coordinate analysis, a Bayesian clustering analysis, phylogenetic trees and a neighbor network based on 13,155 SNPs. Genotypic data indicate a clear separation of accessions into two subpopulations, which correlates with the flowering type (annual vs. biennial). Four winter-annual accessions were closer related to biennial accessions. In an analysis of molecular variance, genetic variation between the two subpopulations was 7.84%. In addition, we estimated the genome size for 35 accessions by flow cytometry. An average genome size of 4.282 pg/2C (± 0.0096 S.E.) was estimated. Therefore, we suggest a significantly smaller genome size than stated in literature.

“…Alongside its widespread use as a molecular protocol, a variety of bioinformatic software has been designed to work specifically with RADseq data (Catchen et al., 2011; Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013; Chong, Ruan, & Wu, 2012; Eaton, 2014; Eaton & Overcast, 2020; Melo & Hale, 2019; Puritz, Hollenbeck, & Gold, 2014), and methods have been developed to optimize the application of these software after data generation (Ilut, Nydam, & Hare, 2014; McCartney‐Melstad, Gidiş, & Shaffer, 2019; Paris, Stevens, & Catchen, 2017; Rochette & Catchen, 2017). However, software and parameter optimization protocols are not effective if the underlying sequenced data has captured little of the true biological signal.…”

Section: Introduction (mentioning)

confidence: 99%

Simulation with RADinitio improves RADseq experimental design and sheds light on sources of missing data

Rivera‐Colón, Rochette, Catchen 2020

Molecular Ecology Resources

Restriction‐site associated DNA sequencing (RADseq) has become a powerful and versatile tool in modern population genomics, enabling large‐scale evolutionary and genomic analyses in otherwise inaccessible biological systems. With its widespread use, different variants on the protocol have been developed to suit specific experimental needs. Researchers face the challenge of choosing the optimal molecular and sequencing protocols for their reduced representation experimental design, an often‐complicated process. Strategic errors can lead to biased data generation that has reduced power to answer biological questions. Here, we present RADinitio, simulation software for the selection and optimization of RADseq experiments via the generation of sequencing data that behave similarly to empirical sources. RADinitio provides an evolutionary simulation of populations, implementation of various RADseq protocols with customizable parameters, and thorough assessment of missing data. We test the efficacy of the software using different RAD protocols across several organisms, highlighting the importance of protocol selection on the magnitude and quality of data acquired. Additionally, we test the effects of RAD library preparation and sequencing on allelic dropout, observing that library preparation and sequencing often contributes more to missing alleles than population‐level variation.
