2018
DOI: 10.1093/bioinformatics/bty873

Expanded functionality, increased accuracy, and enhanced speed in the de novo genotyping-by-sequencing pipeline GBS-SNP-CROP

Abstract: Summary: GBS-SNP-CROP is a bioinformatics pipeline originally developed to support the cost-effective genome-wide characterization of plant genetic resources through paired-end genotyping-by-sequencing (GBS), particularly in the absence of a reference genome. Since its 2016 release, the pipeline’s functionality has greatly expanded, its computational efficiency has improved, and its applicability to a broad set of genomic studies for both plants and animals has been demonstrated. This note details …

Citations: cited by 11 publications (12 citation statements)
References: 26 publications (23 reference statements)
“…Raw FASTQ files were generated by CASAVA 1.8.3 and analyzed using the reference-free bioinformatics pipeline GBS-SNP-CROP [38, 66]. A Mock Reference (MR) was constructed using the high quality PE reads from the two parents; and putative variants, both SNPs and indels, were identified via alignment of high quality PE reads from the parents and all F1 progeny to the MR, following the pipeline’s recommended parameters for diploid species.…”
Section: Methods (mentioning)
confidence: 99%
“…Because a reference genome of caraway is not available, a reference was built using vsearch (v2.7.1_linux_x86_64) including a dereplication (default parameter) and a clustering (non-default parameter: cluster_fast, id 0.93, sizein True, sizeout True) [33]. In detail, the built reference can be called a 'mock reference', composed of consensus GBS fragments [34]. Reads were mapped against this reference using BWA-mem (v0.7.15-r1140) [35].…”
Section: SNP Discovery (mentioning)
confidence: 99%
“…Alongside its widespread use as a molecular protocol, a variety of bioinformatic software has been designed to work specifically with RADseq data (Catchen et al., 2011; Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013; Chong, Ruan, & Wu, 2012; Eaton, 2014; Eaton & Overcast, 2020; Melo & Hale, 2019; Puritz, Hollenbeck, & Gold, 2014), and methods have been developed to optimize the application of these software after data generation (Ilut, Nydam, & Hare, 2014; McCartney‐Melstad, Gidiş, & Shaffer, 2019; Paris, Stevens, & Catchen, 2017; Rochette & Catchen, 2017). However, software and parameter optimization protocols are not effective if the underlying sequenced data has captured little of the true biological signal.…”
Section: Introduction (mentioning)
confidence: 99%