How noncoding DNA determines gene expression in different cell types is a major unsolved problem, and critical downstream applications in human genetics depend on improved solutions. Here, we report substantially improved gene expression prediction accuracy from DNA sequences through the use of a deep learning architecture, called Enformer, that is able to integrate information from long-range interactions (up to 100 kb away) in the genome. This improvement yielded more accurate variant effect predictions on gene expression for both natural genetic variants and saturation mutagenesis measured by massively parallel reporter assays. Furthermore, Enformer learned to predict enhancer–promoter interactions directly from the DNA sequence competitively with methods that take direct experimental data as input. We expect that these advances will enable more effective fine-mapping of human disease associations and provide a framework to interpret cis-regulatory evolution.
The extreme genetic heterogeneity of nonsyndromic hearing loss (NSHL) makes genetic diagnosis expensive and time-consuming using available methods. To assess the feasibility of target enrichment and massively parallel sequencing technologies to interrogate all exons of all genes implicated in NSHL, we tested nine patients diagnosed with hearing loss. Solid-phase (NimbleGen) or solution-based (SureSelect) sequence capture, followed by 454 or Illumina sequencing, respectively, were compared. Sequencing reads were mapped using GSMAPPER, BFAST, and BOWTIE, and pathogenic variants were identified using a custom variant-calling and annotation pipeline (ASAP) that incorporates publicly available in silico pathogenicity prediction tools (SIFT, BLOSUM, Polyphen2, and Align-GVGD). Samples included one negative control, three positive controls (one biological replicate), and six unknowns (10 samples total), in which we genotyped 605 single nucleotide polymorphisms (SNPs) by Sanger sequencing to measure sensitivity and specificity for SureSelect-Illumina and NimbleGen-454 methods at saturating sequence coverage. Causative mutations were identified in the positive controls but not in the negative control. In five of six idiopathic hearing loss patients we identified the pathogenic mutation. Massively parallel sequencing technologies provide sensitivity, specificity, and reproducibility at levels sufficient to perform genetic diagnosis of hearing loss.
deafness | genomics | Usher syndrome | diagnostics | next-generation sequencing
Hereditary sensorineural hearing loss (SNHL) is the most common sensory impairment in humans (1, 2). In developed countries, two-thirds of prelingual-onset SNHL is estimated to have a genetic etiology, of which ∼70% is nonsyndromic hearing loss (NSHL). Eighty percent of NSHL is autosomal recessive nonsyndromic hearing loss (ARNSHL), ∼20% is autosomal dominant (AD), and the remainder is composed of X-linked and mitochondrial forms (1, 3).
To date, 134 deafness loci have been identified, and 32 recessive (DFNB), 23 dominant (DFNA) and 2 X-linked (DFNX) genes have been cloned; 8 genes are associated with both ARNSHL and ADNSHL (4).
Establishing a genetic diagnosis of NSHL is a critical component of the clinical evaluation of deaf and hard-of-hearing persons and their families. If a genetic cause of hearing loss is determined, it is possible to provide families with prognostic information, recurrence risks, and improved habilitation options. For persons diagnosed with Usher syndrome, preventative measures including sunlight protection and vitamin therapy can be implemented to minimize the rate of progression of retinitis pigmentosa (5). Most current genetic testing strategies for NSHL rely on a gene-specific Sanger sequencing approach. Because mutations in a single gene, GJB2 (DFNB1), account for up to 50% of ARNSHL in many world populations (6), this approach has changed the evaluation of patients with presumed ARNSHL. However, the mutation frequency in other genes in persons with NSHL in outbred populat...
Background: Non-syndromic hearing loss (NSHL) is the most common sensory impairment in humans. Until recently its extreme genetic heterogeneity precluded comprehensive genetic testing. Using a platform that couples targeted genomic enrichment (TGE) and massively parallel sequencing (MPS) to sequence all exons of all genes implicated in NSHL, we test 100 persons with presumed genetic NSHL and in so doing establish sequencing requirements for maximum sensitivity and define MPS quality score metrics that obviate Sanger validation of variants.
Methods: We examined DNA from 100 sequentially collected probands with presumed genetic NSHL without exclusions due to inheritance, previous genetic testing, or type of hearing loss. We performed TGE using post-capture multiplexing in variable pool sizes followed by Illumina sequencing. We developed a local Galaxy installation on a high-performance computing cluster for bioinformatics analysis.
Results: To obtain maximum variant sensitivity with this platform, 3.2–6.3 million total mapped sequencing reads per sample are required. Quality score analysis showed that Sanger validation is not required for 95% of variants. Our overall diagnostic rate was 42% but varied by clinical features from 0% for persons with asymmetric hearing loss to 56% for persons with bilateral autosomal recessive NSHL.
Conclusions: These findings will direct the use of TGE and MPS strategies for genetic diagnosis of NSHL. Our diagnostic rate highlights the need for further research on genetic deafness focused on novel gene identification and an improved understanding of the role of non-exonic mutations. The unsolved families we have identified provide a valuable resource to address these areas.
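The per-sample read requirement reported above can be turned into a rough pooling budget. The following is a minimal sketch, not the authors' procedure: the function name `max_pool_size`, the lane yield, and the mapped fraction are all hypothetical illustration values, and the calculation assumes reads are spread evenly across the pooled samples.

```python
import math


def max_pool_size(lane_reads: int, mapped_fraction: float,
                  reads_per_sample: int) -> int:
    """Largest pool size that still meets a per-sample mapped-read target.

    All inputs are assumptions supplied by the caller; only the
    3.2-6.3 million mapped reads per sample range comes from the abstract.
    Assumes reads are distributed evenly across the pooled samples.
    """
    if lane_reads <= 0 or reads_per_sample <= 0:
        raise ValueError("read counts must be positive")
    if not 0 < mapped_fraction <= 1:
        raise ValueError("mapped_fraction must be in (0, 1]")
    usable_reads = lane_reads * mapped_fraction
    return math.floor(usable_reads / reads_per_sample)


if __name__ == "__main__":
    # Hypothetical lane: 100 million reads, 90% of which map.
    for target in (3_200_000, 6_300_000):
        print(target, max_pool_size(100_000_000, 0.9, target))
```

In practice pooling is rarely perfectly even, so a real design would likely leave headroom below the number this returns.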
The prevalence of DFNA8/DFNA12 (DFNA8/12), a type of autosomal dominant non-syndromic hearing loss (ADNSHL), is unknown as comprehensive population-based genetic screening has not been conducted. We therefore completed unbiased screening for TECTA mutations in a Spanish cohort of 372 probands from ADNSHL families. Three additional families (Spanish, Belgian and English) known to be linked to DFNA8/12 were also included in the screening. In an additional cohort of 835 American ADNSHL families, we preselected 73 probands for TECTA screening based on audiometric data. In aggregate, we identified 23 TECTA mutations in this process. Remarkably, 20 of these mutations are novel, more than doubling the number of reported TECTA ADNSHL mutations from 13 to 33. Mutations lie in all domains of the α-tectorin protein, including, for the first time, the entactin domain, the vWFD1, vWFD2 and vWFD3 repeats, and the D1-D2 and TIL2 connectors. While the majority are private mutations, four of them – p.Cys1036Tyr, p.Cys1837Gly, p.Thr1866Met and p.Arg1890Cys – were observed in more than one unrelated family. For two of these mutations founder effects were also confirmed. Our data validate previously observed genotype-phenotype correlations in DFNA8/12 and introduce new correlations. Specifically, mutations in the N-terminal region of α-tectorin (entactin domain, vWFD1 and vWFD2) lead to mid-frequency NSHL, a phenotype previously associated only with mutations in the ZP domain. Collectively, our results indicate that DFNA8/12 hearing loss is a frequent type of ADNSHL.
The naked mole-rat (Heterocephalus glaber) is widely acclaimed to be cancer-resistant and of considerable research interest based on a paucity of reports of neoplasia in this species. We have, however, encountered four spontaneous cases of neoplasia and one presumptive case of neoplasia through routine necropsy and biopsy of individuals in a zoo collection of nonhybrid naked mole-rats bred from a single pair. One case each of metastasizing hepatocellular carcinoma, nephroblastoma (Wilms' tumor), and multicentric lymphosarcoma, as well as presumptive esophageal adenocarcinoma (Barrett's esophagus-like), was identified postmortem among 37 nonautolyzed necropsy submissions of naked mole-rats over 1 year old received between 1998 and August 2015. One incidental case of cutaneous hemangioma was also identified antemortem by skin biopsy from one naked mole-rat examined for trauma.