2018
DOI: 10.1093/bioinformatics/bty291

Viral quasispecies reconstruction via tensor factorization with successive read removal

Abstract: Motivation: As RNA viruses mutate and adapt to environmental changes, often developing resistance to anti-viral vaccines and drugs, they form an ensemble of viral strains, a viral quasispecies. While high-throughput sequencing (HTS) has enabled in-depth studies of viral quasispecies, sequencing errors and limited read lengths render the problem of reconstructing the strains and estimating their spectrum challenging. Inference of viral quasispecies is difficult due to generally non-uniform frequencies of the stra…

Cited by 27 publications
(44 citation statements)
References 28 publications
“…The sampling is sparse because the reads are much shorter than the haplotypes; moreover, the reads may be erroneous due to sequencing errors. Following (Ahn 2018), we formalize the sampling operation as…”
Section: Problem Formulation | mentioning
confidence: 99%
“…where M is the one-to-one mapping from the set of reconstructed haplotype to the set of true haplotype (Hashemi 2018), i.e., mapping that determines the best possible match between the two sets of haplotypes. To characterize performance of methods for reconstruction of viral quasispecies with generally a priori unknown number of components, in addition to correct phasing rate we also quantify recall rate, defined as the fraction of perfectly reconstructed components in a population (i.e., recall rate = TP / (TP + FN)), and predicted proportion, defined as the ratio of the estimated and the true number of components in a genomic mixture (Ahn 2018).…”
Section: Problem Formulation | mentioning
confidence: 99%
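
The two metrics defined in the statement above can be illustrated with a minimal sketch. It assumes that haplotypes are given as equal-length strings and that "perfectly reconstructed" means an exact string match; the function names `recall_rate` and `predicted_proportion` are hypothetical and are not taken from the cited papers.

```python
def recall_rate(true_haps, recon_haps):
    """Fraction of true components reconstructed perfectly: TP / (TP + FN).

    Assumption: a true haplotype counts as a true positive only if an
    identical string appears among the reconstructed haplotypes.
    """
    true_set = set(true_haps)
    if not true_set:
        raise ValueError("at least one true haplotype is required")
    recon_set = set(recon_haps)
    tp = sum(1 for h in true_set if h in recon_set)  # perfectly recovered
    fn = len(true_set) - tp                          # missed or imperfect
    return tp / (tp + fn)


def predicted_proportion(true_haps, recon_haps):
    """Ratio of the estimated to the true number of components."""
    if not true_haps:
        raise ValueError("at least one true haplotype is required")
    return len(recon_haps) / len(true_haps)


# Toy data, invented purely for illustration.
true_haps = ["ACGT", "ACGA", "TCGA"]
recon_haps = ["ACGT", "TCGA", "TTTT", "ACGG"]

recall = recall_rate(true_haps, recon_haps)              # 2 of 3 recovered
proportion = predicted_proportion(true_haps, recon_haps)  # 4 estimated vs 3 true
```

A predicted proportion above 1 indicates that more components were estimated than are truly present, and a value below 1 indicates that fewer were.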
“…Our method is designed to cluster contigs produced by existing assembly tools. There are another group of methods conducting haplotype reconstruction via read clustering [22,23], which groups variant sites obtained by read mapping against reference genomes. These tools don't usually output contigs and thus do not use contig binning.…”
Section: Related Work | mentioning
confidence: 99%