The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios and define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome, and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three- to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community, allowing us to make recommendations for maximizing structural variation sensitivity in future genome sequencing studies.
Genomic association studies of common or rare protein-coding variation have established robust statistical approaches to account for multiple testing. Here, we present a comparable framework to evaluate rare and de novo noncoding single nucleotide variants, insertion/deletions, and all classes of structural variation from whole-genome sequencing (WGS). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, we define 51,801 annotation categories. Analyses of 519 autism spectrum disorder families did not identify association with any categories after correction for 4,123 effective tests. Without appropriate correction, biologically plausible associations are observed in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still exhibited the strongest associations. Thus, in autism the contribution of de novo noncoding variation is probably modest compared to de novo coding variants. Robust results from future WGS studies will require large cohorts and comprehensive analytical strategies that consider the substantial multiple testing burden.
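To make the multiple testing burden concrete, the following minimal sketch computes a per-test significance threshold for the 4,123 effective tests reported above. It assumes a Bonferroni-style correction and a family-wise error rate of 0.05; neither is stated in the abstract, and the function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni-style correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 4,123 effective tests comes from the abstract.
# alpha = 0.05 is an assumed family-wise error rate, not a value from the study.
threshold = bonferroni_threshold(0.05, 4123)
print(f"per-test threshold: {threshold:.2e}")
```

Under these assumptions, a category would need a p-value on the order of 1 × 10⁻⁵ to survive correction, which illustrates why large cohorts are needed.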
Exocytosis is essential to the lytic cycle of apicomplexan parasites and required for the pathogenesis of toxoplasmosis and malaria. DOC2 proteins recruit the membrane fusion machinery required for exocytosis in a Ca2+-dependent fashion. Here, the phenotype of a Toxoplasma gondii conditional mutant impaired in host cell invasion and egress was pinpointed to a defect in secretion of the micronemes, an apicomplexan-specific organelle that contains adhesion proteins. Whole genome sequencing identified the etiological point mutation in TgDOC2.1. A conditional allele of the orthologous gene engineered into Plasmodium falciparum was also defective in microneme secretion. However, the major effect was on invasion, suggesting microneme secretion is dispensable for Plasmodium egress.
A genetic etiology is identified for one third of congenital heart disease (CHD) patients, including 8% attributable to coding de novo variants (DNVs). To assess the contribution of noncoding DNVs to CHD, we compared genome sequences from 749 CHD probands and their parents with 1,611 unaffected trios. Neural network prediction of noncoding DNV transcriptional impact identified a burden of DNVs in CHD (n = 2,238 DNVs) compared to controls (n = 4,177; P = 8.7 × 10⁻⁴). Independent analyses of enhancers showed excess DNVs in associated genes (27 genes vs. 3.7 expected, P = 1 × 10⁻⁵). We observed significant overlap between these transcription-based approaches (OR = 2.5, 95% CI 1.1–5.0, P = 5.4 × 10⁻³). CHD DNVs altered transcription levels in five of 31 enhancers assayed. Finally, we observed DNV burden in RNA-binding protein regulatory sites (OR = 1.13, 95% CI 1.1–1.2, P = 8.8 × 10⁻⁵). Our findings demonstrate an enrichment of potentially disruptive regulatory noncoding DNVs in a fraction of CHD at least as high as observed for damaging coding DNVs.
Germline mutation rates in humans have been estimated for a variety of mutation types, including single-nucleotide and large structural variants. Here, we directly measure the germline retrotransposition rate for the three active retrotransposon elements: L1, Alu, and SVA. We used three tools for calling mobile element insertions (MEIs) (MELT, RUFUS, and TranSurVeyor) on blood-derived whole-genome sequence (WGS) data from 599 CEPH individuals, comprising 33 three-generation pedigrees. We identified 26 de novo MEIs in 437 births. The retrotransposition rate estimate for Alu elements, one in 40 births, is roughly half the rate estimated using phylogenetic analyses, a difference in magnitude similar to that observed for single-nucleotide variants. The L1 retrotransposition rate is one in 63 births and is within range of previous estimates (1:20 to 1:200 births). The SVA retrotransposition rate, one in 63 births, is much higher than the previous estimate of one in 900 births. Our large, three-generation pedigrees allowed us to assess parent-of-origin effects and the timing of insertion events in either gametogenesis or early embryonic development. We find a statistically significant paternal bias in Alu retrotransposition. Our study represents the first in-depth analysis of the rate and dynamics of human retrotransposition from WGS data in three-generation human pedigrees.
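The "one in N births" figures above are ratios of births to observed events. As a minimal sketch, the overall rate can be computed from the two totals the abstract reports (26 de novo MEIs in 437 births). The function name `births_per_event` is hypothetical, and the per-element counts are not given in the abstract, so only the combined rate is shown.

```python
def births_per_event(n_events: int, n_births: int) -> float:
    """Express a rate as 'one event in N births' and return N."""
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    return n_births / n_events


# Both totals are taken from the abstract: 26 de novo MEIs in 437 births.
overall = births_per_event(26, 437)
print(f"about one de novo MEI in {overall:.1f} births")
```

This is a point estimate only; it carries no confidence interval and does not separate L1, Alu, and SVA.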
Early infantile epileptic encephalopathy (EIEE) is a devastating epilepsy syndrome with onset in the first months of life. Although mutations in more than 50 different genes are known to cause EIEE, current diagnostic yields with gene panel tests or whole-exome sequencing are below 60%. We applied whole-genome analysis (WGA) consisting of whole-genome sequencing and comprehensive variant discovery approaches to a cohort of 14 EIEE subjects for whom prior genetic tests had not yielded a diagnosis. We identified both de novo point and INDEL mutations and de novo structural rearrangements in known EIEE genes, as well as mutations in genes not previously associated with EIEE. The detection of a pathogenic or likely pathogenic mutation in all 14 subjects demonstrates the utility of WGA to reduce the time and costs of clinical diagnosis of EIEE. While exome sequencing may have detected 12 of the 14 causal mutations, 3 of the 12 patients received non-diagnostic exome panel tests prior to genome sequencing. Thus, given the continued decline of sequencing costs, our results support the use of WGA with comprehensive variant discovery as an efficient strategy for the clinical diagnosis of EIEE and other genetic conditions.
Background: Toxoplasma gondii has a largely clonal population in North America and Europe, with types I, II and III clonal lineages accounting for the majority of strains isolated from patients. RH, a particular type I strain, is most frequently used to characterize Toxoplasma biology. However, compared to other type I strains, RH has unique characteristics such as faster growth, increased extracellular survival rate and inability to form orally infectious cysts. Thus, to identify candidate genes that could account for these parasite phenotypic differences, we determined genetic differences and differential parasite gene expression between RH and another type I strain, GT1. Moreover, as differences in host cell modulation could affect Toxoplasma replication in the host, we determined differentially modulated host processes among the type I strains through host transcriptional profiling. Results: Through whole genome sequencing, we identified 1,394 single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) between RH and GT1. These SNPs/indels together with parasite gene expression differences between RH and GT1 were used to identify candidate genes that could account for type I phenotypic differences. A polymorphism in dense granule protein, GRA2, determined RH and GT1 differences in the evasion of the interferon gamma response. In addition, host transcriptional profiling identified that genes regulated by NF-κB, such as interleukin (IL)-12p40, were differentially modulated by the different type I strains. We subsequently showed that this difference in NF-κB activation was due to polymorphisms in GRA15. Furthermore, we observed that RH, but not other type I strains, recruited phosphorylated IκBα (a component of the NF-κB complex) to the parasitophorous vacuole membrane and this recruitment of p-IκBα was partially dependent on GRA2. Conclusions: We identified candidate parasite genes that could be responsible for phenotypic variation among the type I strains through comparative genomics and transcriptomics. We also identified differentially modulated host pathways among the type I strains, and these can serve as a guideline for future studies in examining the phenotypic differences among type I strains.
The simultaneous targeting of host and pathogen processes represents an untapped approach for the treatment of intracellular infections. Hypoxia-inducible factor-1 (HIF-1) is a host cell transcription factor that is activated by and required for the growth of the intracellular protozoan parasite Toxoplasma gondii at physiological oxygen levels. Parasite activation of HIF-1 is blocked by inhibiting the family of closely related Activin-Like Kinase (ALK) host cell receptors ALK4, ALK5, and ALK7, which was determined in part by use of an ALK4,5,7 inhibitor named SB505124. Besides inhibiting HIF-1 activation, SB505124 also potently blocks parasite replication under normoxic conditions. To determine whether SB505124 inhibition of parasite growth was exclusively due to inhibition of ALK4,5,7 or because the drug inhibited a second kinase, SB505124-resistant parasites were isolated by chemical mutagenesis. Whole-genome sequencing of these mutants revealed mutations in the Toxoplasma MAP kinase, TgMAPK1. Allelic replacement of mutant TgMAPK1 alleles into wild-type parasites was sufficient to confer SB505124 resistance. SB505124 independently impacts TgMAPK1 and ALK4,5,7 signaling, since drug-resistant parasites could not activate HIF-1 in the presence of SB505124 or grow in HIF-1-deficient cells. In addition, TgMAPK1 kinase activity is inhibited by SB505124. Finally, mice treated with SB505124 had significantly lower tissue burdens following Toxoplasma infection. These data therefore identify SB505124 as a novel small molecule inhibitor that acts by inhibiting two distinct targets, host HIF-1 and TgMAPK1.