SUMMARY
Although most cervical human papillomavirus type 16 (HPV16) infections become undetectable within 1–2 years, persistent HPV16 causes half of all cervical cancers. We used a novel HPV whole-genome sequencing technique to evaluate an exceptionally large collection of 5,570 HPV16-infected case-control samples to determine whether viral genetic variation influences risk of cervical precancer and cancer. We observed thousands of unique HPV16 genomes; very few women shared an identical HPV16 sequence, which should stimulate a careful re-evaluation of the clinical implications of HPV mutation rates, transmission, clearance, and persistence. In case-control analyses, HPV16 in the controls had significantly more amino acid-changing variants throughout the genome. Strikingly, E7 was devoid of variants in precancers/cancers, compared with higher levels in the controls; we confirmed this in cancers from around the world. Strict conservation of the 98 amino acids of E7, which disrupts Rb function, is critical for HPV16 carcinogenesis, presenting a highly specific target for etiologic and therapeutic research.
Specific HPV16 variant sublineages strongly influence risk of histologic types of precancer and cancer, and viral genetic variation may help explain the unique carcinogenic properties of HPV16.
For unknown reasons, there is huge variability in risk conferred by different HPV types and, remarkably, strong differences even between closely related variant lineages within each type. HPV16 is a uniquely powerful carcinogenic type, causing approximately half of cervical cancers and most other HPV-related cancers. To permit the large-scale study of HPV genome variability and precancer/cancer, starting with HPV16 and cervical cancer, we developed a high-throughput next-generation sequencing (NGS) whole-genome method. We designed a custom HPV16 AmpliSeq™ panel that generates 47 overlapping amplicons covering 99% of the genome, which we sequenced on the Ion Torrent Proton platform. After validating the method against Sanger sequencing, the current “gold standard” of sequencing, in 89 specimens (concordance of 99.9%), we used our NGS method and custom annotation pipeline to sequence 796 HPV16-positive exfoliated cervical cell specimens. The median completion rate per sample was 98.0%.
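As an illustration of the per-sample completion rate reported above, the following is a minimal sketch that assumes completion means the fraction of genome positions covered at or above some read depth. The abstract does not define the metric, so that definition, the `min_depth` threshold, the function name `completion_rate`, and the toy depth values are all hypothetical and are not taken from the authors' pipeline.

```python
from statistics import median


def completion_rate(depths, min_depth=1):
    """Return the fraction of positions whose read depth is >= min_depth.

    Assumption: "completion" means per-position coverage; the source
    abstract does not state how the metric was actually computed.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Hypothetical toy data: read depth at each position of a 10-position
# "genome" for three samples (a real HPV16 genome is several kilobases).
samples = {
    "sample_a": [30, 42, 0, 51, 60, 44, 38, 0, 29, 33],     # 8 of 10 covered
    "sample_b": [12, 15, 18, 20, 25, 22, 19, 17, 14, 11],   # 10 of 10 covered
    "sample_c": [0, 5, 9, 7, 8, 6, 4, 3, 2, 1],             # 9 of 10 covered
}

# Per-sample completion rates, then the median across samples.
rates = {name: completion_rate(d) for name, d in samples.items()}
median_rate = median(rates.values())

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"median completion rate: {median_rate:.1%}")
```

Reporting the median rather than the mean keeps a few poorly amplified specimens from dominating the summary figure.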
Our method enabled us to discover novel SNPs and large contiguous deletions suggestive of viral integration (OR of 27.3, 95% CI 3.3–222, P=0.002), and to detect variant lineage coinfections sensitively. This method represents an innovative high-throughput, ultra-deep coverage technique for HPV genomic sequencing, which, in turn, enables the investigation of the role of genetic variation in HPV epidemiology and carcinogenesis.
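For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a 2×2 table, the sketch below uses the standard Wald (log-scale) approximation. It is only an illustration: the abstract does not state which groups were compared or which statistical method produced the reported estimate, and the counts used here are hypothetical, not the study's data.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Table layout (hypothetical labels):
                     outcome+   outcome-
        exposed         a          b
        unexposed       c          d

    z=1.96 gives an approximate 95% interval. All cells must be > 0,
    because the Wald standard error is undefined for zero counts.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only to demonstrate the calculation.
or_, lo, hi = odds_ratio_wald_ci(a=10, b=5, c=20, d=40)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With sparse cells, as a very wide interval such as 3.3–222 suggests, an exact method may be preferable to the Wald approximation; the choice shown here is purely for simplicity.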