2014
DOI: 10.1007/978-3-319-05269-4_25

Viral Quasispecies Assembly via Maximal Clique Enumeration

Abstract: Virus populations can display high genetic diversity within individual hosts. The intra-host collection of viral haplotypes, called viral quasispecies, is an important determinant of virulence, pathogenesis, and treatment outcome. We present HaploClique, a computational approach to reconstruct the structure of a viral quasispecies from next-generation sequencing data as obtained from bulk sequencing of mixed virus samples. We develop a statistical model for paired-end reads accounting for mutations, insertions…

Cited by 20 publications (33 citation statements)
References 44 publications
“…For this error correction step, approximate suffix-prefix overlaps are computed to establish an initial read overlap graph. Inspired by Baaijens et al (2017) and Töpfer et al (2014), maximal cliques are enumerated in the non-oriented graph and errors are corrected by inspecting the read overlaps within the cliques. By design of the overlap graph (edges indicate that two reads stem from identical haplotypes), every clique only contains reads from identical haplotypes, which allows to eliminate errors based on majority votes.…”
Section: Methods (mentioning)
confidence: 99%
“…In terms of assembly paradigms, POLYTE is an overlap graph based approach. It adopts ideas from earlier work that either focused on variant discovery (Marschall et al, 2012), viral quasispecies assembly (Baaijens et al, 2017; Töpfer et al, 2014) or metagenome gene assembly (Gregor et al, 2016) and unites the virtues of Marschall et al (2012), the ability to handle low coverage, on the one hand, and Baaijens et al (2017); Töpfer et al (2014), dealing with real overlap graphs and contig computation, on the other hand. That is, POLYTE brings forth an iterative overlap graph based scheme for contig generation that reliably works in low coverage settings, requiring coverage of only as low as 5x per haplotype.…”
Section: Introduction (mentioning)
confidence: 99%
“…The real composition of viral populations demands also new classification approaches to group components of mutant spectra (either from one isolate, from sequential isolates from one infected host, or from different hosts). Computational methods to organize and interpret the increasing numbers of minority variants being discovered in viral quasispecies have been developed (Prosperi et al, 2011; Poh et al, 2013; Gregori et al, 2014; Mangul et al, 2014; Topfer et al, 2014; for review see Marz et al, 2014). PAQ groups those viral sequences that are separated by the shortest genetic distances.…”
Section: Viral Quasispecies (mentioning)
confidence: 99%
“…They often rely on the availability of closely related reference genomes of the studied species (Ahn et al, 2015; Töpfer et al, 2014; Zagordi et al, 2011), where reads are first mapped onto a reference genome, using a read mapping tool, e.g. BWA (Li and Durbin, 2009), strain variants are then identified through a reference guided strain aware assembly.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this line, there has been recent evidence that shorter genomes can be assembled through overlap graph based approaches, which make use of full-length reads, using short reads (Simpson and Durbin, 2012). It was also shown that one can perform strain aware assembly through iterative construction of overlap graphs (Töpfer et al, 2014). For gene assembly from metagenomic data, the SAT assembler (Zhang et al, 2014) can be employed.…”
Section: Introduction (mentioning)
confidence: 99%
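
The idea that recurs in the citation statements above (build a read overlap graph, enumerate maximal cliques, correct errors by majority vote within each clique) can be illustrated with a small toy sketch. This is not HaploClique's statistical model or any cited tool's implementation: the read representation as `(start, sequence)` pairs, the mismatch-count edge criterion, the parameters `min_overlap` and `max_mismatches`, and all function names are assumptions made for illustration. Clique enumeration here uses the standard Bron-Kerbosch algorithm with pivoting.

```python
from collections import Counter
from itertools import combinations


def compatible(a, b, min_overlap=3, max_mismatches=1):
    """Toy edge criterion (an assumption): two placed reads are joined if they
    overlap by at least min_overlap positions with few mismatches."""
    (sa, qa), (sb, qb) = a, b
    lo, hi = max(sa, sb), min(sa + len(qa), sb + len(qb))
    if hi - lo < min_overlap:
        return False
    mismatches = sum(qa[i - sa] != qb[i - sb] for i in range(lo, hi))
    return mismatches <= max_mismatches


def build_graph(reads, **kw):
    """Undirected overlap graph as an adjacency dict: read index -> neighbours."""
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if compatible(reads[i], reads[j], **kw):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch with pivoting; yields each maximal clique as a frozenset."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def correct(reads, clique):
    """Majority vote per reference position, then rewrite each read in the clique."""
    votes = {}
    for i in clique:
        start, seq = reads[i]
        for k, base in enumerate(seq):
            votes.setdefault(start + k, Counter())[base] += 1
    consensus = {pos: c.most_common(1)[0][0] for pos, c in votes.items()}
    return {
        i: "".join(consensus[reads[i][0] + k] for k in range(len(reads[i][1])))
        for i in clique
    }


if __name__ == "__main__":
    # Hypothetical reads from two haplotypes; reads 2 and 5 each carry one error.
    reads = [
        (0, "ACGTACGT"), (0, "ACGTACGT"), (0, "ACGAACGT"),
        (0, "ACTTAGGA"), (0, "ACTTAGGA"), (0, "ACTTAGCA"),
    ]
    graph = build_graph(reads)
    for clique in maximal_cliques(graph):
        print(sorted(clique), correct(reads, clique))
```

In this toy setting, the mismatch threshold is what keeps reads from different haplotypes unconnected, so each clique can be corrected independently. Real data would need an error-aware edge criterion and a tie-breaking rule for positions with equal votes.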