De novo mutations (DNMs) are important in autism spectrum disorder (ASD), but analyses so far have focused mainly on the ~1.5% of the genome that encodes genes. Here, we performed whole-genome sequencing (WGS) of 200 ASD parent–child trios and characterised germline and somatic DNMs. We confirmed that the majority of germline DNMs (75.6%) originated from the father, and these increased significantly with paternal age only (P = 4.2 × 10⁻¹⁰). However, when clustered DNMs (those within 20 kb) were found in ASD, not only did they mostly originate from the mother (P = 7.7 × 10⁻¹³), but they could also be found adjacent to de novo copy number variations, where the mutation rate was significantly elevated (P = 2.4 × 10⁻²⁴). By comparing with DNMs detected in controls, we found a significant enrichment of predicted damaging DNMs in ASD cases (P = 8.0 × 10⁻⁹; odds ratio = 1.84), of which 15.6% (P = 4.3 × 10⁻³) and 22.5% (P = 7.0 × 10⁻⁵) were non-coding or genic non-coding, respectively. The non-coding elements most enriched for DNMs were untranslated regions of genes, regulatory sequences involved in exon-skipping and DNase I hypersensitive regions. Using microarrays and a novel outlier detection test, we also found aberrant methylation profiles in 2/185 (1.1%) of ASD cases. These same individuals carried independently identified DNMs in the ASD-risk and epigenetic genes DNMT3A and ADNP. Our data begin to characterise different genome-wide DNMs and highlight the contribution of non-coding variants to the aetiology of ASD.
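The abstract defines clustered DNMs only as "those within 20 kb". A minimal Python sketch of one way such clusters could be called is given below. It assumes the input holds the DNMs of a single individual, that a cluster is a chain of neighbouring DNMs on the same chromosome separated by at most 20,000 bp, and that a cluster needs at least two members. The function name `cluster_dnms`, the `max_gap` parameter and the example coordinates are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def cluster_dnms(dnms, max_gap=20_000):
    """Group DNMs whose neighbours on the same chromosome lie within max_gap bp.

    dnms: iterable of (chrom, pos) tuples for one individual (assumption).
    Returns a list of clusters; each cluster is a list of (chrom, pos)
    tuples with at least two members.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in dnms:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= max_gap:
                # Close enough to the previous DNM: extend the current chain.
                current.append(pos)
            else:
                # Gap too large: keep the chain only if it is a real cluster.
                if len(current) > 1:
                    clusters.append([(chrom, p) for p in current])
                current = [pos]
        if len(current) > 1:
            clusters.append([(chrom, p) for p in current])
    return clusters


# Made-up coordinates, for illustration only.
example = [
    ("chr1", 100_000), ("chr1", 115_000), ("chr1", 500_000),
    ("chr2", 100), ("chr2", 20_100), ("chr2", 50_000),
]
example_clusters = cluster_dnms(example)
```

Chaining neighbours means a cluster can span more than 20 kb in total; a fixed-window definition would be an equally plausible reading of the abstract.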
The COVID-19 pandemic has accounted for millions of infections and hundreds of thousands of deaths worldwide in a short period of time. Patients show great diversity in clinical and laboratory manifestations and in disease severity. Nonetheless, little is known about the host genetic contribution to the observed interindividual phenotypic variability. Here, we report the first host genetic study in the Chinese population, deeply sequencing and analyzing 332 COVID-19 patients from the Shenzhen Third People’s Hospital, categorized by level of severity. Based on a total of 22.2 million genetic variants, we conducted both single-variant and gene-based association tests among five severity groups (asymptomatic, mild, moderate, severe, and critically ill patients) after correcting for potential confounding factors. Pedigree analysis suggested a potential monogenic effect of loss-of-function variants in GOLGA3 and DPP7 on critically ill and asymptomatic disease presentation. The genome-wide association study suggests that the most significant gene locus associated with severity is located in TMEM189–UBE2V1, which is involved in the IL-1 signaling pathway. The p.Val197Met missense variant, which affects the stability of the TMPRSS2 protein, displays a decreased allele frequency among severe patients compared with mild patients and the general population. We identified that the HLA-A*11:01, B*51:01, and C*14:02 alleles significantly predispose patients to the worst outcome. This initial genomic study of Chinese patients provides genetic insights into the phenotypic differences among COVID-19 patient groups and highlights genes and variants that may help guide targeted efforts in containing the outbreak. Limitations and advantages of the study are also reviewed to guide future international efforts to elucidate the genetic architecture of host–pathogen interaction for COVID-19 and other infectious and complex diseases.
Background: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100-fold) than point mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion.
Results: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger than 1 kb. Excluding the 59 SVs (54 insertions/deletions, 5 inversions) that overlap with N-base gaps in the reference assembly hg19, 666 non-gap SVs remained, and 396 of them (60%) were verified by paired-end data from whole-genome sequencing-based re-sequencing or de novo assembly sequence from fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides valuable information for complex regions with haplotypes in a straightforward fashion.
In addition, with long single-molecule labeling patterns, exogenous viral sequences were mapped on a whole-genome scale, and sample heterogeneity was analyzed at a new level.
Conclusion: Our study highlights genome mapping technology as a comprehensive and cost-effective method for detecting structural variation and studying complex regions in the human genome, as well as deciphering viral integration into the host genome.
Electronic supplementary material: The online version of this article (doi:10.1186/2047-217X-3-34) contains supplementary material, which is available to authorized users.
The gut microbiome has been implicated in a variety of physiological states, but controversy over causality remains unresolved. Here, we performed bidirectional Mendelian randomization (MR) analyses on 3,432 Chinese individuals with whole genome, whole metagenome, anthropometric, and blood metabolic trait data. We identified 58 causal relationships between the gut microbiome and blood metabolites, and replicated 43 of them. Increased relative abundances of fecal Oscillibacter and Alistipes were causally linked to decreased triglyceride concentration. Conversely, blood metabolites such as glutamic acid appeared to decrease fecal Oxalobacter, and members of Proteobacteria were influenced by metabolites such as 5-methyltetrahydrofolic acid, alanine, glutamate, and selenium. Two-sample MR with data from Biobank Japan partly corroborated results with triglyceride and with uric acid, and also provided causal support for published fecal bacterial markers for cancer and cardiovascular diseases. This study illustrates the value of human genetic information to help prioritize gut microbial features for mechanistic and clinical studies.
Metagenome-wide association studies (MWAS) using human stool samples, as well as animal models, especially germ-free mice, have pointed to a potential role of the gut microbiome in diseases such as cardiometabolic, autoimmune, neuropsychiatric disorders and cancer, with mechanistic investigations for diseases such as obesity, colorectal cancer and schizophrenia [1-4]. Twin-based heritability estimation and more recent metagenome-genome-wide association studies (M-GWAS) have questioned the traditional view of the gut microbiota as a purely environmental factor [5-9], although the extent of the genetic influence remains controversial [7,10].
Yet, all these published cohorts, except for human sequences in the metagenomic data of HMP (Human Microbiome Project), utilized array data for human genetics, and most of them had 16S rRNA gene amplicon sequencing for the fecal microbiota [5-9].
As the gut microbiome is considered to be highly dynamic, causality has been an unresolved issue in the field. Mendelian randomization (MR) [11] offers an opportunity to distinguish between causal and non-causal effects from cross-sectional data, without animal studies or randomized controlled trials. An early study used MR to look at the gut microbiota and ischemic heart disease [12]. Recently, a study used MR to confirm that increased relative abundance of bacteria producing the fecal volatile short-chain fatty acid (SCFA) butyrate was causally linked to improved insulin response to oral glucose challenge; in contrast, another fecal SCFA, propionate, was causally related to an increased risk of T2D [13]. However, both studies used genotype data, and it was not clear to what extent the genetic factors explained the microbial feature of interest.
In this study, we present a large-scale M-GWAS using whole genome and fecal microbiome, followed by bidirectional MR for the fecal microbiome and anthropometric...
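To make the MR idea above concrete, the following Python sketch shows two widely used summary-statistic estimators: the Wald ratio for a single genetic instrument and the fixed-effect inverse-variance weighted (IVW) estimate across several instruments. The text does not state which estimators the study used, so this is an illustration under that assumption. The function names and the effect sizes are hypothetical, and the standard errors use the common first-order approximation that ignores uncertainty in the instrument-exposure effect.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal effect estimate from one instrument (e.g. one SNP).

    beta_exposure: SNP effect on the exposure (e.g. a microbial abundance).
    beta_outcome:  SNP effect on the outcome (e.g. a blood metabolite).
    se_outcome:    standard error of beta_outcome.
    """
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats beta_exposure as known exactly.
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(instruments):
    """Fixed-effect IVW estimate over (beta_exposure, beta_outcome, se_outcome)."""
    numerator = sum(bx * by / se_y ** 2 for bx, by, se_y in instruments)
    denominator = sum(bx ** 2 / se_y ** 2 for bx, _, se_y in instruments)
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Made-up summary statistics, for illustration only.
toy_instruments = [(0.2, 0.1, 0.05), (0.4, 0.2, 0.05)]
single_estimate, single_se = wald_ratio(*toy_instruments[0])
pooled_estimate, pooled_se = ivw_estimate(toy_instruments)
```

A bidirectional analysis would apply the same estimators twice, swapping the roles of exposure and outcome and using instruments selected for each direction.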
The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
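The haplotype N50 quoted above is a length-weighted summary of phased block sizes. As a minimal Python sketch, N50 is the length L such that blocks of length L or longer together cover at least half of the total assembled length. The function name `n50` and the toy lengths are hypothetical; the study's own pipeline may compute the statistic with different tooling.

```python
def n50(lengths):
    """Return the N50 of a collection of block (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest block down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must contain at least one positive value")


# Toy block lengths (arbitrary units), for illustration only.
toy_n50 = n50([2, 3, 4, 5, 6])
```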
The COVID-19 pandemic has accounted for more than five million infections and hundreds of thousands of deaths worldwide in the past six months. Patients show great diversity in clinical and laboratory manifestations and in disease severity. Nonetheless, little is known about the host genetic contribution to the observed inter-individual phenotypic variability. Here, we report the first host genetic study in China, deeply sequencing and analyzing 332 COVID-19 patients from the Shenzhen Third People's Hospital, categorized by level of severity. Based on a total of 22.2 million genetic variants, we conducted both single-variant and gene-based association tests among the five severity groups (asymptomatic, mild, moderate, severe and critically ill patients) after correcting for potential confounding factors. The most significant gene locus associated with severity is located in TMEM189-UBE2V1, which is involved in the IL-1 signaling pathway. The p.Val197Met missense variant, which affects the stability of the TMPRSS2 protein, displays a decreased allele frequency among severe patients compared with mild patients and the general population. We also identified that the HLA-A*11:01, B*51:01 and C*14:02 alleles significantly predispose patients to the worst outcome. This initial study of Chinese patients provides a comprehensive view of the genetic differences among COVID-19 patient groups and highlights genes and variants that may help guide targeted efforts in containing the outbreak. Limitations and advantages of the study are also reviewed to guide future international efforts to elucidate the genetic architecture of host-pathogen interaction for COVID-19 and other infectious and complex diseases.
The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC and associated regions focus on minor variants and HLA typing, many of which have been shown to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lack a coherent, standardized, cost-effective and high-coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes and to encompass the 5 megabases (Mb) of the extended MHC region. We deployed our probes on three genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in the MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into studies of immune-mediated diseases, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community.
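Typing accuracy at 4-digit resolution means that two HLA calls count as a match when their gene and first two colon-separated fields agree. The Python sketch below illustrates that comparison. It is not the released pipeline: the function names `to_four_digit` and `typing_accuracy` are hypothetical, calls are assumed to be paired position by position with a reference typing, expression suffixes such as `N` or `L` are not handled, and the allele strings are illustrative.

```python
def to_four_digit(allele):
    """Reduce e.g. 'HLA-A*11:01:01:01' or 'A*11:01' to 'A*11:01'."""
    if allele.startswith("HLA-"):
        allele = allele[len("HLA-"):]
    gene, _, fields = allele.partition("*")
    parts = fields.split(":")
    if not gene or len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"cannot reduce {allele!r} to 4-digit resolution")
    return f"{gene}*{parts[0]}:{parts[1]}"


def typing_accuracy(called, truth):
    """Fraction of position-paired calls that agree at 4-digit resolution."""
    if not truth or len(called) != len(truth):
        raise ValueError("called and truth must be non-empty and equal in length")
    matches = sum(
        to_four_digit(c) == to_four_digit(t) for c, t in zip(called, truth)
    )
    return matches / len(truth)


# Illustrative calls and reference typings, not data from the study.
toy_called = ["HLA-A*11:01:01:01", "B*51:01", "C*14:02:01"]
toy_truth = ["A*11:01", "B*51:02", "C*14:02"]
toy_accuracy = typing_accuracy(toy_called, toy_truth)
```

In practice each locus carries two alleles per sample, so a fuller comparison would match unordered allele pairs per locus rather than position-paired lists.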
The gut microbiome has been established as a key environmental factor in health. Genetic influences on the gut microbiome have been reported, yet doubts remain as to the significance of genetic associations. Here, we provide shotgun data for whole genome and whole metagenome from a Chinese cohort, identifying a genetic contribution of no less than 20% to the gut microbiota. Using common variant-, rare variant-, and copy number variation-based association analyses, we identified abundant signals associated with the gut microbiome, especially in metabolic, neurological, and immunological functions. The controversial concept of enterotypes may have a genetic attribute, with the top two loci explaining 11% of the Prevotella–Bacteroides variance. Stratification according to gender led to the identification of differential associations in males and females. Our two-stage metagenome genome-wide association studies on a total of 1295 individuals unequivocally illustrate that neither microbiome studies nor GWAS can overlook the other in our quest for a better understanding of human health and diseases.