SMaSH: a benchmarking toolkit for human genome variant calling

Talwalkar, Ameet; Liptrap, Jesse; Newcomb, Julie; Hartl, Christopher; Terhorst, Jonathan; Curtis, Kristal; Bresler, Ma'ayan; Song, Yun S.; Jordan, Michael I.; Patterson, David A.

doi:10.1093/bioinformatics/btu345

Cited by 42 publications

(37 citation statements)

References 35 publications

(33 reference statements)


“…A variety of approaches have been recently developed to address the challenges in variant representation. [9][10][11][21][22] Real Time Genomics (RTG) developed the comparison tool vcfeval, which introduced the idea of comparing variants at the level of the genomic haplotypes that the variants represent as a way to overcome the problems associated with comparing complex variants, where alternative yet equivalent variant representations can confound direct comparison methods. [9] Variant "normalization" tools help to represent variants in a standardized way (e.g., by left-shifting indels in repeats), but they demonstrated that "variant normalization" approaches alone were not able to reconcile different representations of many complex variants.…”

Section: Variant Representation (mentioning)

confidence: 99%
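The "left-shifting indels in repeats" step mentioned in the statement above can be illustrated with a minimal sketch. This is not the vcfeval or SMaSH implementation; the function name `left_align`, the 0-based coordinate convention, and the toy reference string are assumptions made only for illustration.

```python
def left_align(ref_seq, pos, ref, alt):
    """Left-align and trim one variant (hypothetical helper, 0-based pos).

    ref_seq: reference sequence as a string
    pos:     0-based index of the first base of the ref allele
    ref/alt: reference and alternate alleles
    """
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base from both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos == 0:
                raise ValueError("cannot extend past start of reference")
            pos -= 1
            ref = ref_seq[pos] + ref
            alt = ref_seq[pos] + alt
            changed = True
    # Trim shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Two spellings of the same "CA" deletion inside a CA repeat (toy reference).
reference = "GGGCACACACAGGG"
a = left_align(reference, 8, "ACA", "A")
b = left_align(reference, 3, "CAC", "C")
```

Both calls reduce to the same left-most record, so a record-by-record comparison would now treat them as equal. As the statement notes, this kind of normalization alone does not reconcile every complex variant, which is the motivation for haplotype-level comparison.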

“…First, benchmarking must consider that variants may be represented in multiple ways in the commonly used variant call format (VCF). [9][10][11][12] When comparing VCF files record by record, many of the putative differences are simply different representations of the same variant. Secondly, definitions for performance metrics such as true positive (TP), false positive (FP), and false negative (FN), which are key for the interpretation of the benchmarking results, are not yet standardized.…”

Section: Introduction (mentioning)

confidence: 99%


Best Practices for Benchmarking Germline Small Variant Calls in Human Genomes

Krusche, Trigg, Boutros, et al. 2018. Preprint.


Assessing accuracy of NGS variant calling is immensely facilitated by a robust benchmarking strategy and tools to carry it out in a standard way. Benchmarking variant calls requires careful attention to definitions of performance metrics, sophisticated comparison approaches, and stratification by variant type and genome context. The Global Alliance for Genomics and Health (GA4GH) Benchmarking Team has developed standardized performance metrics and tools for benchmarking germline small variant calls. This Team includes representatives from sequencing technology developers, government agencies, academic bioinformatics researchers, clinical laboratories, and commercial technology and bioinformatics developers for whom benchmarking variant calls is essential to their work. Benchmarking variant calls is a challenging problem for many reasons:
• Evaluating variant calls requires complex matching algorithms and standardized counting, because the same variant may be represented differently in truth and query callsets.
• Defining and interpreting resulting metrics such as precision (aka positive predictive value = TP/(TP+FP)) and recall (aka sensitivity = TP/(TP+FN)) requires standardization to draw robust conclusions about comparative performance for different variant calling methods.
• Performance of NGS methods can vary depending on variant types and genome context; and as a result understanding performance requires meaningful stratification.
• High-confidence variant calls and regions that can be used as "truth" to accurately identify false positives and negatives are difficult to define, and reliable calls for the most challenging regions and variants remain out of reach.
We have made significant progress on standardizing comparison methods, metric definitions and reporting, as well as developing and using truth sets. Our methods are publicly available on GitHub (https://github.com/ga4gh/benchmarking-tools) and in a web-based app on precisionFDA, which allow users to compare their variant calls against truth sets and to obtain a standardized report on their variant calling performance. Our methods have been piloted in the precisionFDA variant calling challenges to identify the best-in-class variant calling methods within high-confidence regions. Finally, we recommend a set of best practices for using our tools and critically evaluating the results.
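The precision and recall definitions quoted in this abstract can be written out as a short sketch. The function name `precision_recall` and the example counts are illustrative assumptions, not part of the GA4GH tools; how TP, FP, and FN are counted depends on the matching method used.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true/false positive and false negative counts.

    precision = TP / (TP + FP)   (positive predictive value)
    recall    = TP / (TP + FN)   (sensitivity)
    A metric is reported as NaN when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    recall = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return precision, recall


# Made-up counts for one hypothetical stratum of a callset comparison.
p, r = precision_recall(tp=90, fp=10, fn=30)
```

Computing these per stratum (for example, per variant type) rather than once over the whole callset is one simple way to follow the stratification point made in the abstract.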


“…Second, although simulated data are widely used for their easy access, low cost, and clear constitution of positives and negatives, several common artifacts are beyond current simulation yet, such as: the non-random distribution of variants, incomplete reference genome, and copy number variations (CNVs) [24]. Since simulation datasets are collections of synthetic reads based on simple generative models while real datasets are much more complex and harder to call variation on, they may not truly tell the same story as real sequencing data [25]. Third, as for the use of indirect properties of mutation calls instead of direct validation in these studies, multiple metrics of the prediction sets must be weighed to estimate the performance of each program rather than simply counting overlaps between different methods or calculating the average read depths.…”

mentioning

confidence: 99%

In-depth comparison of somatic point mutation callers based on different tumor next-generation sequencing depth data

Cai, Yuan, Zhang, et al. 2016. Sci Rep.


Four popular somatic single nucleotide variant (SNV) calling methods (Varscan, SomaticSniper, Strelka and MuTect2) were carefully evaluated on the real whole exome sequencing (WES, depth of ~50X) and ultra-deep targeted sequencing (UDT-Seq, depth of ~370X) data. The four tools returned poor consensus on candidates (only 20% of calls were with multiple hits by the callers). For both WES and UDT-Seq, MuTect2 and Strelka obtained the largest proportion of COSMIC entries as well as the lowest rate of dbSNP presence and high-alternative-alleles-in-control calls, demonstrating their superior sensitivity and accuracy. Combining different callers does increase reliability of candidates, but narrows the list down to very limited range of tumor read depth and variant allele frequency. Calling SNV on UDT-Seq data, which were of much higher read-depth, discovered additional true-positive variations, despite an even more tremendous growth in false positive predictions. Our findings not only provide valuable benchmark for state-of-the-art SNV calling methods, but also shed light on the access to more accurate SNV identification in the future.


“…To evaluate the performance of these algorithms we used SMaSH [20], a recently developed suite of tools for benchmarking variant calling algorithms. Briefly, SMaSH is motivated by the lack of a gold-standard NGS benchmarking dataset which both a) mimics a realistic use-case (i.e.…”

Section: Datasets and Evaluation (mentioning)

confidence: 99%

“…We used the same window size for AllChange. For CAGe++, we set the variant calling parameters to (α₁, α₂, α₃, α₄, α₅) = (12, 10, 20, 3, 20).…”

mentioning

confidence: 99%

Changepoint Analysis for Efficient Variant Calling

Bloniarz, Talwalkar, Terhorst, et al. 2014. Lecture Notes in Computer Science. Self-citation.


Abstract. We present CAGe, a statistical algorithm which exploits high sequence identity between sampled genomes and a reference assembly to streamline the variant calling process. Using a combination of changepoint detection, classification, and online variant detection, CAGe is able to call simple variants quickly and accurately on the 90-95% of a sampled genome which differs little from the reference, while correctly learning the remaining 5-10% that must be processed using more computationally expensive methods. CAGe runs on a deeply sequenced human whole genome sample in approximately 20 minutes, potentially reducing the burden of variant calling by an order of magnitude after one memory-efficient pass over the data.


SMaSH: a benchmarking toolkit for human genome variant calling

Abstract: We provide free and open access online to the SMaSH tool kit, along with detailed documentation, at smash.cs.berkeley.edu
