Background: To facilitate the clinical implementation of genomic medicine by next-generation sequencing, it will be critically important to obtain accurate and consistent variant calls on personal genomes. Multiple software tools for variant calling are available, but it is unclear how comparable these tools are or what their relative merits in real-world scenarios might be.
Methods: We sequenced 15 exomes from four families using commercial kits (Illumina HiSeq 2000 platform and Agilent SureSelect version 2 capture kit), with approximately 120X mean coverage. We analyzed the raw data using near-default parameters with five different alignment and variant-calling pipelines (SOAP, BWA-GATK, BWA-SNVer, GNUMAP, and BWA-SAMtools). We additionally sequenced a single whole genome using the sequencing and analysis pipeline from Complete Genomics (CG), with 95% of the exome region being covered by 20 or more reads per base. Finally, we validated 919 single-nucleotide variations (SNVs) and 841 insertions and deletions (indels), including similar fractions of GATK-only, SOAP-only, and shared calls, on the MiSeq platform by amplicon sequencing with approximately 5000X mean coverage.
Results: SNV concordance between five Illumina pipelines across all 15 exomes was 57.4%, while 0.5 to 5.1% of variants were called as unique to each pipeline. Indel concordance was only 26.8% between three indel-calling pipelines, even after left-normalizing and intervalizing genomic coordinates by 20 base pairs. Of the CG variants falling within the regions targeted by exome sequencing, 11% were not called by any of the Illumina-based exome analysis pipelines. Based on targeted amplicon sequencing on the MiSeq platform, 97.1%, 60.2%, and 99.1% of the GATK-only, SOAP-only and shared SNVs could be validated, but only 54.0%, 44.6%, and 78.1% of the GATK-only, SOAP-only and shared indels could be validated. Additionally, our analysis of two families (one with four individuals and the other with seven) demonstrated additional accuracy gained in variant discovery by having access to genetic data from a multi-generational family.
Conclusions: Our results suggest that more caution should be exercised in genomic medicine settings when analyzing individual genomes, including interpreting positive and negative findings with scrutiny, especially for indels. We advocate for renewed collection and sequencing of multi-generational families to increase the overall accuracy of whole genomes.
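The concordance figures above compare the call sets of several pipelines. The abstract does not state the exact formula, so the following is only a minimal sketch, assuming concordance means the share of the union of all calls that every pipeline reports, and that a pipeline's "unique" fraction is the share of that union called by it alone. The variant tuples, the pipeline labels and the names `concordance`, `unique_fraction` and `toy_calls` are hypothetical and are not data from the study.

```python
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def concordance(calls: Dict[str, Set[Variant]]) -> float:
    """Fraction of the union of all calls that is reported by every pipeline."""
    sets = list(calls.values())
    union = set().union(*sets)
    if not union:
        return 0.0
    shared = set.intersection(*sets)
    return len(shared) / len(union)


def unique_fraction(calls: Dict[str, Set[Variant]], name: str) -> float:
    """Fraction of the union that is reported only by the pipeline `name`."""
    union = set().union(*calls.values())
    if not union:
        return 0.0
    others = set().union(*(s for n, s in calls.items() if n != name))
    return len(calls[name] - others) / len(union)


# Hypothetical toy call sets for three made-up pipelines (not study data).
toy_calls: Dict[str, Set[Variant]] = {
    "pipeline_A": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")},
    "pipeline_B": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
    "pipeline_C": {("chr1", 100, "A", "G"), ("chr3", 10, "T", "C")},
}

if __name__ == "__main__":
    print(f"concordance: {concordance(toy_calls):.2f}")
    for pipeline in toy_calls:
        print(f"{pipeline} unique: {unique_fraction(toy_calls, pipeline):.2f}")
```

Exact tuple matching is a reasonable simplification for SNVs only. For indels, the abstract describes left-normalizing and intervalizing coordinates by 20 base pairs before comparison, which this sketch does not attempt.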
The Greenlandic population, a small and historically isolated founder population comprising about 57,000 inhabitants, has experienced a dramatic increase in type 2 diabetes (T2D) prevalence during the past 25 years. Motivated by this, we performed association mapping of T2D-related quantitative traits in up to 2,575 Greenlandic individuals without known diabetes. Using array-based genotyping and exome sequencing, we discovered a nonsense p.Arg684Ter variant (in which arginine is replaced by a termination codon) in the gene TBC1D4 with an allele frequency of 17%. Here we show that homozygous carriers of this variant have markedly higher concentrations of plasma glucose (β = 3.8 mmol l⁻¹, P = 2.5 × 10⁻³⁵) and serum insulin (β = 165 pmol l⁻¹, P = 1.5 × 10⁻²⁰) 2 hours after an oral glucose load compared with individuals with other genotypes (both non-carriers and heterozygous carriers). Furthermore, homozygous carriers have marginally lower concentrations of fasting plasma glucose (β = −0.18 mmol l⁻¹, P = 1.1 × 10⁻⁶) and fasting serum insulin (β = −8.3 pmol l⁻¹, P = 0.0014), and their T2D risk is markedly increased (odds ratio (OR) = 10.3, P = 1.6 × 10⁻²⁴). Heterozygous carriers have a moderately higher plasma glucose concentration 2 hours after an oral glucose load than non-carriers (β = 0.43 mmol l⁻¹, P = 5.3 × 10⁻⁵). Analyses of skeletal muscle biopsies showed lower messenger RNA and protein levels of the long isoform of TBC1D4, and lower muscle protein levels of the glucose transporter GLUT4, with increasing number of p.Arg684Ter alleles. These findings are concomitant with a severely decreased insulin-stimulated glucose uptake in muscle, leading to postprandial hyperglycaemia, impaired glucose tolerance and T2D. The observed effect sizes are several times larger than any previous findings in large-scale genome-wide association studies of these traits and constitute further proof of the value of conducting genetic association studies outside the traditional setting of large homogeneous populations.
Disseminated superficial actinic porokeratosis (DSAP) is an autosomal dominantly inherited epidermal keratinization disorder whose etiology remains unclear. We performed exome sequencing in one unaffected and two affected individuals from a DSAP family. The mevalonate kinase gene (MVK) emerged as the only candidate gene located in previously defined linkage regions after filtering against existing SNP databases, eight HapMap exomes and 1000 Genomes Project data and taking into consideration the functional implications of the mutations. Sanger sequencing in 57 individuals with familial DSAP and 25 individuals with sporadic DSAP identified MVK mutations in 33% and 16% of these individuals (cases), respectively. All 14 MVK mutations identified in our study were absent in 676 individuals without DSAP. Our functional studies in cultured primary keratinocytes suggest that MVK has a role in regulating calcium-induced keratinocyte differentiation and could protect keratinocytes from apoptosis induced by type A ultraviolet radiation. Our results should help advance the understanding of DSAP pathogenesis.
Non-obstructive azoospermia (NOA), a severe form of male infertility, is often suspected to be linked to currently undefined genetic abnormalities. To explore the genetic basis of this condition, we successfully sequenced ~650 infertility-related genes in 757 NOA patients and 709 fertile males. We evaluated the contributions of rare variants to the etiology of NOA by identifying individual genes showing nominal associations and testing the genetic burden of a given biological process as a whole. We found a significant excess of rare, non-silent variants in genes that are key epigenetic regulators of spermatogenesis, such as BRWD1, DNMT1, DNMT3B, RNF17, UBR2, USP1 and USP26, in NOA patients (P = 5.5 × 10⁻⁷), corresponding to a carrier frequency of 22.5% of patients and 13.7% of controls (P = 1.4 × 10⁻⁵). An accumulation of low-frequency variants was also identified in additional epigenetic genes (BRDT and MTHFR). Our study suggests a potential association between genetic defects in epigenetic regulator genes and spermatogenic failure in humans.
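The carrier-frequency comparison above (22.5% of patients versus 13.7% of controls) can be illustrated with a simple two-group test. The abstract does not say which statistical test produced its P values, so the sketch below swaps in a pooled two-proportion z-test and an odds ratio purely as an illustration. The carrier counts are back-calculated by rounding the reported percentages against the cohort sizes of 757 and 709, which is an assumption: the actual denominators and counts used in the study may differ, and the result should not be expected to reproduce the published P value. The function and variable names are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(carriers_a: int, n_a: int,
                          carriers_b: int, n_b: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = carriers_a / n_a
    p_b = carriers_b / n_b
    pooled = (carriers_a + carriers_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def odds_ratio(carriers_a: int, n_a: int, carriers_b: int, n_b: int) -> float:
    """Odds of being a carrier in group A relative to group B."""
    odds_a = carriers_a / (n_a - carriers_a)
    odds_b = carriers_b / (n_b - carriers_b)
    return odds_a / odds_b


# ASSUMPTION: counts reconstructed from the reported percentages, not study data.
N_PATIENTS, N_CONTROLS = 757, 709
patient_carriers = round(0.225 * N_PATIENTS)
control_carriers = round(0.137 * N_CONTROLS)

z, p_value = two_proportion_z_test(patient_carriers, N_PATIENTS,
                                   control_carriers, N_CONTROLS)
or_estimate = odds_ratio(patient_carriers, N_PATIENTS,
                         control_carriers, N_CONTROLS)

if __name__ == "__main__":
    print(f"z = {z:.2f}, two-sided p = {p_value:.2e}, OR = {or_estimate:.2f}")
```

A gene-set burden analysis in practice would typically also account for covariates and multiple testing, which this sketch leaves out.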
Background: Marie Unna hereditary hypotrichosis (MUHH) is an autosomal dominant disorder characterised by coarse, wiry, twisted hair that develops in early childhood, followed by progressive hair loss. MUHH is a genetically heterogeneous disorder. No gene in the 1p21.1–1q21.3 region responsible for MUHH has been identified.
Methods: Exome sequencing was performed on two affected subjects, who had normal vertex hair and modest alopecia, and one unaffected individual from a four-generation MUHH family in which our previous linkage study mapped the MUHH locus to chromosome 1p21.1–1q21.3.
Results: We identified a missense mutation in EPS8L3 (NM_024526.3: exon2: c.22G>A: p.Ala8Thr) within 1p21.1–1q21.3. Sanger sequencing confirmed the cosegregation of this mutation with the disease phenotype in the family by demonstrating the presence of the heterozygous mutation in all eight affected individuals and its absence in all seven unaffected individuals. This mutation was absent in 676 unrelated healthy controls and in 781 patients with other diseases from another unpublished project of our group.
Conclusions: Taken together, our results suggest that EPS8L3 is a causative gene for MUHH, which should help advance the understanding of the pathogenesis of MUHH. Our study also further demonstrates the effectiveness of combining exome sequencing with linkage information for identifying Mendelian disease genes.