2017
DOI: 10.1101/105817
Preprint

A recurrence based approach for validating structural variation using long-read sequencing technology

Abstract: Although numerous algorithms have been developed to identify structural variations (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. This is significant as the accurate identification of structural variation is still an outstanding but important problem in genomics. The emergence of new sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region through which…

Cited by 13 publications
(16 citation statements)
References 20 publications
“…The PacBio genome assembly-based approach overlooked this insertion, as it fell within a 'reference L1 rich' region (Fig. 4d). Upon further inspection, this event was supported as a 403 bp heterozygous L1Hs insertion by the existence of significant retrotransposition hallmarks, as well as recurrence (dot) plots 29,66 (Fig. 4e, see Methods). This insertion also shares a high sequence identity with a nearby reference L1PA11 element (Supplementary Table 7).…”
Section: Cas9 Enrichment and Nanopore Sequencing Captures Non-referen… (mentioning)
confidence: 87%
“…For SNVs, we compared the calls from 3 strategies using the benchmark of NA12878 [40] and NA24385 [41]. For SVs, we compared 3 Linked-Read sets (R9, R10, R11) from HG002 with the Tier 1 SV benchmark from GIAB [42] and used VaPoR [43] to validate our SV calls based on PacBio CCS reads from NA24385 [44]. We compared SNV and SV calls among the different approaches using vcfeval [45] and truvari [42], respectively.…”
Section: Methods (mentioning)
confidence: 99%
“…For computational validation, we obtained ONT reads of HG00733 from HGSVC and applied VaPoR [59], an independent structural variants validation method, to validate these CSVs (Supplementary Note). VaPoR is able to validate calls based [on] predicted region and types with a confidence score.…”
Section: Methods (mentioning)
confidence: 99%