Efficient error correction for next-generation sequencing of viral amplicons

Skums, Pavel; Dimitrova, Zoya; Campo, David S.; Vaughan, Gilberto; Rossi, Liana Chesini; Forbi, Joseph C.; Yokosawa, Jonny; Zelikovsky, Alexander; Khudyakov, Yury

doi:10.1186/1471-2105-13-s10-s6

Cited by 95 publications

(75 citation statements)

References 27 publications

“…The obtained data sets were processed by the sequential application of the algorithms k-mer error correction (KEC) and a customized version of empirical threshold (ET) (31). Skums et al (31) have previously demonstrated this process to be highly accurate in finding true haplotypes and removing false haplotypes.…”

Section: Methods (mentioning)

confidence: 99%

Analysis of the Evolution and Structure of a Complex Intrahost Viral Population in Chronic Hepatitis C Virus Mapped by Ultradeep Pyrosequencing

Palmer, Dimitrova, Skums, et al. 2014. J Virol.

Self Cite

Hepatitis C virus (HCV) causes chronic infection in up to 50% to 80% of infected individuals. Hypervariable region 1 (HVR1) variability is frequently studied to gain an insight into the mechanisms of HCV adaptation during chronic infection, but the changes to and persistence of HCV subpopulations during intrahost evolution are poorly understood. In this study, we used ultradeep pyrosequencing (UDPS) to map the viral heterogeneity of a single patient over 9.6 years of chronic HCV genotype 4a infection. Informed error correction of the raw UDPS data was performed using a temporally matched clonal data set. The resultant data set reported the detection of low-frequency recombinants throughout the study period, implying that recombination is an active mechanism through which HCV can explore novel sequence space. The data indicate that polyvirus infection of hepatocytes has occurred but that the fitness quotients of recombinant daughter virions are too low for the daughter virions to compete against the parental genomes. The subpopulations of parental genomes contributing to the recombination events highlighted a dynamic virome where subpopulations of variants are in competition. In addition, we provide direct evidence that demonstrates the growth of subdominant populations to dominance in the absence of a detectable humoral response.

IMPORTANCE: Analysis of ultradeep pyrosequencing data sets derived from virus amplicons frequently relies on software tools that are not optimized for amplicon analysis, assume random incorporation of sequencing errors, and are focused on achieving higher specificity at the expense of sensitivity. Such analysis is further complicated by the presence of hypervariable regions. In this study, we made use of a temporally matched reference sequence data set to inform error correction algorithms. Using this methodology, we were able to (i) detect multiple instances of hepatitis C virus intrasubtype recombination at the E1/E2 junction (a phenomenon rarely reported in the literature) and (ii) interrogate the longitudinal quasispecies complexity of the virome. Parallel to the UDPS, isolation of IgG-bound virions was found to coincide with the collapse of specific viral subpopulations.

Section: Methods (mentioning)

confidence: 99%

“…In stage 1, the set of k-mers (substring of fixed length k) of reads from the processed data set is calculated and the distribution of frequencies of k-mers is analyzed (31). It was previously observed that the frequencies of erroneous and correct k-mers follow different distributions (32)(33)(34).…”

Section: Methods (mentioning)

confidence: 99%
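
The first stage described in the statement above (computing the k-mers of the reads and looking at how their frequencies are distributed) can be illustrated with a minimal Python sketch. This is not the KEC implementation: the function names, the toy reads, and k = 4 are assumptions chosen only for illustration, and the sketch stops at building the frequency histogram without attempting to separate erroneous from correct k-mers.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def frequency_histogram(counts):
    """Map each observed count to the number of distinct k-mers with that count."""
    return Counter(counts.values())


# Toy data (hypothetical): two identical reads and one read whose last base differs.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGA"]
counts = kmer_counts(reads, k=4)
hist = frequency_histogram(counts)

for kmer, n in counts.most_common():
    print(kmer, n)
```

In this toy input the k-mer covering the altered final base occurs once while the other k-mers occur three or more times, which is the kind of gap between rare and common k-mers that a frequency-based corrector would look for.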

Analysis of the Evolution and Structure of a Complex Intrahost Viral Population in Chronic Hepatitis C Virus Mapped by Ultradeep Pyrosequencing

Palmer, Dimitrova, Skums, et al. 2014. J Virol.

Self Cite

“…The resultant data files were sequentially processed through implementation of the k-mer error correction (KEC) and empirical threshold algorithms as previously described, using the parameters k = 25 and i = 3 (22,27). A panel of clonal sequences temporally matched to the UDPS data was used to further identify and correct homopolymer errors (22,25).…”

Section: Methods (mentioning)

confidence: 99%

Network Analysis of the Chronic Hepatitis C Virome Defines Hypervariable Region 1 Evolutionary Phenotypes in the Context of Humoral Immune Responses

Palmer, Schmidt-Martin, Dimitrova, et al. 2016. J Virol.

Self Cite

Hypervariable region 1 (HVR1) of hepatitis C virus (HCV) comprises the first 27 N-terminal amino acid residues of E2. It is classically seen as the most heterogeneous region of the HCV genome. In this study, we assessed HVR1 evolution by using ultradeep pyrosequencing for a cohort of treatment-naive, chronically infected patients over a short, 16-week period. Organization of the sequence set into connected components that represented single nucleotide substitution events revealed a network dominated by highly connected, centrally positioned master sequences. HVR1 phenotypes were observed to be under strong purifying (stationary) and strong positive (antigenic drift) selection pressures, which were coincident with advancing patient age and cirrhosis of the liver. It followed that stationary viromes were dominated by a single HVR1 variant surrounded by minor variants comprised from conservative single amino acid substitution events. We present evidence to suggest that neutralization antibody efficacy was diminished for stationary-virome HVR1 variants. Our results identify the HVR1 network structure during chronic infection as the preferential dominance of a single variant within a narrow sequence space.

IMPORTANCE: HCV infection is often asymptomatic, and chronic infection is generally well established in advance of initial diagnosis and subsequent treatment. HVR1 can undergo rapid sequence evolution during acute infection, and the variant pool is typically seen to diverge away from ancestral sequences as infection progresses from the acute to the chronic phase. In this report, we describe HVR1 viromes in chronically infected patients that are defined by a dominant epitope located centrally within a narrow variant pool. Our findings suggest that weakened humoral immune activity, as a consequence of persistent chronic infection, allows for the acquisition and maintenance of host-specific adaptive mutations at HVR1 that reflect virus fitness.

Hepatitis C virus (HCV) infection is a global health issue and is recognized as a major etiological agent of liver-related diseases (1). It has been estimated that the current prevalence of HCV represents approximately 2% of the global adult (15 years of age and older) population (2). Following transmission, HCV infection may remain asymptomatic for decades, resulting in the majority of infections initially passing undetected (3). It is estimated that up to 4 million Americans are living with the virus, the majority of whom became infected prior to the isolation and identification of the virus (4, 5). Consequently, the U.S. Centers for Disease Control and Prevention now recommend that Americans born from 1945 to 1965 be screened for the presence of the virus notwithstanding the presence of clinical symptoms (3, 5).

HCV is a single-stranded positive-sense RNA virus of considerable genomic heterogeneity. A recent reclassification defined the HCV global distribution into 7 genotypes and 67 subtypes, with genotypes 1 and 3 accounting for the majority of infections worldwide (6...
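
The abstract for this citing paper mentions organizing the sequence set into connected components that represent single nucleotide substitution events. A minimal Python sketch of that idea follows, assuming equal-length variants and linking two variants when they differ at exactly one position. The function names and toy sequences are hypothetical, and this is an illustration of the general technique rather than the authors' pipeline.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def one_step_network(variants):
    """Link every pair of distinct variants that differ by a single substitution."""
    unique = list(dict.fromkeys(variants))  # drop duplicates, keep input order
    neighbors = {v: set() for v in unique}
    for a, b in combinations(unique, 2):
        if len(a) == len(b) and hamming(a, b) == 1:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def connected_components(neighbors):
    """Group variants that are reachable from one another through single-substitution links."""
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(neighbors[current] - seen)
        components.append(component)
    return components


# Toy variants (hypothetical): two clusters separated by several substitutions.
variants = ["AAAA", "AAAT", "AATT", "CCCC", "CCCG"]
network = one_step_network(variants)
components = connected_components(network)
print(components)
```

In such a network, a variant with many neighbors (a high node degree) would correspond to the highly connected, centrally positioned sequences the abstract refers to. The pairwise comparison here is quadratic in the number of variants, which is acceptable for a sketch but would need a faster neighbor search on deep sequencing data.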

“…The obtained reads were further processed with the program PRINSEQ, which depurates pyrosequencing data based on the read quality (Schmieder & Edwards, 2011). After this, the data were filtered by the error correction algorithm implemented in the program KEC, which corrects outlier sequences based on k-mer frequencies and identifies haplotypes based on the corresponding results (Skums et al, 2012). Given that KEC returns a corrected set of sequences, we did not use this output directly.…”

mentioning

confidence: 99%

Virus evolution during chronic hepatitis B virus infection as revealed by ultradeep sequencing data

Jones, Sede, Manrique, et al. 2016. Journal of General Virology.

Despite chronic hepatitis B virus (HBV) infection (CHB) being a leading cause of liver cirrhosis and cancer, HBV evolution during CHB is not fully understood. Recent studies have indicated that virus diversity progressively increases along the course of CHB and that some virus mutations correlate with severe liver conditions such as chronic hepatitis, cirrhosis and hepatocellular carcinoma. Using ultradeep sequencing (UDS) data from an intrafamilial case, we detected such mutations at low frequencies among three immunotolerant patients and at high frequencies in an inactive carrier. Furthermore, our analyses indicated that the HBV population from the seroconverter patient underwent many genetic changes in response to virus clearance. Together, these data indicate a potential use of UDS for developing non-invasive biomarkers for monitoring disease changes over time or in response to specific therapies. In addition, our analyses revealed that virus clearance seemed not to require the virus effective population size to decline. A detailed genetic analysis of the viral lineages arising during and after the clearance suggested that mutations at or close to critical elements of the core promoter (enhancer II, epsilon encapsidation signal, TA2, TA3 and direct repeat 1-hormone response element) might be responsible for a sustained replication. This hypothesis requires the decline in virus load to be explained by constant clearance of virus-producing hepatocytes, consistent with the sustained progress towards serious liver conditions experienced by many CHB patients.
