2021
DOI: 10.1093/nar/gkab576

Accurate assembly of minority viral haplotypes from next-generation sequencing through efficient noise reduction

Abstract: Rapidly evolving RNA viruses continuously produce minority haplotypes that can become dominant if they are drug-resistant or can better evade the immune system. Therefore, early detection and identification of minority viral haplotypes may help to promptly adjust the patient’s treatment plan preventing potential disease complications. Minority haplotypes can be identified using next-generation sequencing, but sequencing noise hinders accurate identification. The elimination of sequencing noise is a non-trivial…

Cited by 40 publications (53 citation statements)
References 64 publications
“…In addition, we also benchmarked reference-guided methods such as PredictHaplo [ 11 ] and CliqueSNV [ 12 ], which can reconstruct haplotypes from long-read sequencing data. However, we failed to run PredictHaplo on our long-read data sets (we have reported the so far unresolved issue at https://github.com/cbg-ethz/PredictHaplo/issues/1 ).…”
Section: Results (mentioning)
confidence: 99%
“…So far, existing methods for viral quasispecies assembly can be classified into reference-based approaches on the one hand and de novo (reference free) approaches on the other hand; see [ 9 ] for a recent review of related approaches. Reference-based methods such as ShoRAH [ 10 ], PredictHaplo [ 11 ] and CliqueSNV [ 12 ] require high quality reference for reliable reconstruction of strains and, apart from rare exceptions [ 11 , 12 ], mainly have been specializing in processing relatively error-free short read data. Importantly, high quality reference genomes may not be available precisely when they are needed the most: very often, new outbreaks of known viruses are caused by virus variants that significantly deviate from curated reference sequence [ 13 , 14 ].…”
Section: Introduction (mentioning)
confidence: 99%
“…*** p < 0.001, **** p < 0.0001, n.s., not significant. B Genetic diversity of the inducible cell-associated spliced HIV-1 RNA and proviral DNA based on the number of viral haplotypes using CliqueSNV ( 103 ). Unpaired t test was used to compare the number of unique HIV-1 variants (viral haplotypes) between the reservoir diversity in both cohorts of patients and between individuals infected with subtype B (U.S.) and non-B (Uganda) HIV-1 subtypes.…”
Section: Resultsmentioning
confidence: 99%
“…Implemented in the DEEPGEN™ Software Tool Suite [ 51 ], p-distance measures the proportion of different nucleotide sites between two pair of sequences (reads). Next, the number and frequency of unique HIV-1 variants (viral haplotypes) within each clinical sample was determined using CliqueSNV [ 103 ], which accurately assemblies both majority and minority (i.e., frequencies as low as 0.1%) haplotypes and estimate their frequencies within the viral population.…”
Section: Methods (mentioning)
confidence: 99%
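
The p-distance mentioned in the statement above is the proportion of nucleotide sites at which two sequences differ. A minimal Python sketch of that definition follows. It is not the DEEPGEN implementation: the function name `p_distance`, the `skip_chars` parameter, and the choice to ignore gap and `N` positions are assumptions made for illustration, and the inputs are assumed to be already aligned to equal length.

```python
def p_distance(seq_a: str, seq_b: str, skip_chars: str = "-N") -> float:
    """Proportion of compared sites at which two aligned sequences differ.

    Hypothetical helper for illustration only. Sites where either sequence
    has a character in `skip_chars` (gap or ambiguous base, by assumption)
    are excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0   # sites where both sequences have a usable base
    differing = 0  # compared sites where the bases differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


if __name__ == "__main__":
    # Two of eight sites differ, so the p-distance is 2/8.
    print(p_distance("ACGTACGT", "ACGTTCGA"))
```

Excluding gapped sites is one common convention; a pipeline that counts gaps as differences would give larger distances on the same pair of reads.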
“…Strainline uses a combination of local De Bruijn graph assembly and overlap extending to generate haplotype genomes 19 . CliqueSNV constructs haplotype sequences by recognizing linked SNVs that are supported by a single read 21 . While both methods claim to assemble genomes at strain level resolution, haplotype phasing from ONT sequencing protocols for SARS-CoV-2 is challenging due to the limited read length from amplicon sequencing (250bp-500bp) 22 , uneven coverage, and susceptibility to bias from single nucleotide variation.…”
Section: Main Textmentioning
confidence: 99%