DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally employed long (400–800 bp) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intra-species genetic variation. We report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterise four million SNPs and four hundred thousand structural variants, many of which are previously unknown. Our approach is effective for accurate, rapid and economical whole genome re-sequencing and many other biomedical applications.
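The depth figure quoted above (more than 30x average depth from 35-base reads) implies a read count that can be estimated with simple arithmetic: average depth is total sequenced bases divided by genome size. The sketch below is illustrative only. The 3 Gb genome size is a round-number assumption that the abstract does not state, the function names are hypothetical, and the estimate ignores duplicates, unmappable regions and uneven coverage.

```python
import math

# Assumption: a round 3 Gb haploid human genome size (not given in the abstract).
ASSUMED_GENOME_SIZE_BP = 3_000_000_000


def reads_for_depth(genome_size_bp: int, read_length_bp: int, depth: float) -> int:
    """Minimum number of reads whose combined length reaches depth * genome size."""
    if genome_size_bp <= 0 or read_length_bp <= 0 or depth <= 0:
        raise ValueError("genome size, read length and depth must be positive")
    return math.ceil(depth * genome_size_bp / read_length_bp)


def average_depth(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    n_reads = reads_for_depth(ASSUMED_GENOME_SIZE_BP, read_length_bp=35, depth=30)
    n_pairs = math.ceil(n_reads / 2)  # paired reads: two 35-base reads per fragment
    print(f"reads needed: {n_reads:,}")
    print(f"read pairs needed: {n_pairs:,}")
    print(f"depth check: {average_depth(n_reads, 35, ASSUMED_GENOME_SIZE_BP):.2f}x")
```

Under these assumptions the requirement comes to roughly 2.6 billion 35-base reads, which is in line with the abstract's statement that each experiment yields several billion bases and that the full genome required scaling the approach up.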
The molecular pathogenesis of renal cell carcinoma (RCC) is poorly
understood. Whole-genome and exome sequencing followed by innovative tumorgraft
analyses (to accurately determine mutant allele ratios) identified several
putative two-hit tumor suppressor genes including BAP1. BAP1, a
nuclear deubiquitinase, is inactivated in 15% of clear-cell RCCs. BAP1
cofractionates with and binds to HCF-1 in tumorgrafts. Mutations disrupting the
HCF-1 binding motif impair BAP1-mediated suppression of cell proliferation, but
not H2AK119ub1 deubiquitination. BAP1 loss sensitizes RCC cells in
vitro to genotoxic stress. Interestingly, BAP1 and PBRM1 mutations
anticorrelate in tumors (P=3×10⁻⁵), and combined loss of BAP1 and PBRM1
in a few RCCs was associated with rhabdoid features
(q=0.0007). BAP1 and PBRM1 regulate seemingly different
gene expression programs, and BAP1 loss was associated with high tumor grade
(q=0.0005). Our results establish the foundation for an
integrated pathological and molecular genetic classification of RCC, paving the
way for subtype-specific treatments exploiting genetic vulnerabilities.
While mutations affecting protein-coding regions have been examined across many cancers, structural variants at the genome-wide level are still poorly defined. Through integrative deep whole-genome and -transcriptome analysis of 101 castration-resistant prostate cancer metastases (109X tumor/38X normal coverage), we identified structural variants altering critical regulators of tumorigenesis and progression not detectable by exome approaches. Notably, we observed amplification of an intergenic enhancer region 624 kb upstream of the androgen receptor (AR) in 81% of patients, correlating with increased AR expression. Tandem duplication hotspots also occur near MYC, in lncRNAs associated with post-translational MYC regulation. Classes of structural variations were linked to distinct DNA repair deficiencies, suggesting their etiology, including associations of CDK12 mutation with tandem duplications, TP53 inactivation with inverted rearrangements and chromothripsis, and BRCA2 inactivation with deletions. Together, these observations provide a comprehensive view of how structural variations affect critical regulators in metastatic prostate cancer.