Detecting single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) with precision from high-throughput data remains a significant bioinformatics challenge. Accurate detection is necessary before next-generation sequencing can routinely be used in the clinic. In research, scientific advances are inhibited by gaps in data, exemplified by the underrepresented discovery of rare variants, variants in non-coding regions, and indels. The continued presence of false positives and false negatives prevents full automation and requires additional manual verification steps. Our methodology applies both pattern recognition and sensitivity analysis to eliminate false positives and aid in the detection of SNP/indel loci and genotypes from high-throughput data. We chose FK506-binding protein 51 (FKBP5) (6p21.31) as our clinical target because of its role in modulating pharmacological responses to physiological and synthetic glucocorticoids and because of the complexity of the genomic region. We detected genetic variation across a 160 kb region encompassing FKBP5 and discovered 613 SNPs and 57 indels, including a 3.3 kb deletion. We validated our method using three independent data sets and, with Sanger sequencing and Affymetrix and Illumina microarrays, achieved 99% concordance. Furthermore, we were able to detect 267 novel rare variants and assess linkage disequilibrium. Our results showed both a sensitivity and a specificity of 98%, indicating near-perfect classification between true and false variants. The process is scalable and amenable to automation, with the downstream filters taking only 1.5 hours to analyze 96 individuals simultaneously. We provide examples of how our level of precision uncovered the interactions of multiple loci, their predicted influences on mRNA stability, perturbations of the hsp90 binding site, and individual variation in FKBP5 expression. Finally, we show how our discovery of rare variants may change current conceptions of evolution at this locus.
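The sensitivity, specificity, and concordance figures quoted above are simple ratios, and a minimal Python sketch may make the definitions concrete. The function names, the dictionary layout for genotype calls, and all counts below are hypothetical illustrations, not the authors' pipeline or data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real variants that were called (true positive rate)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-variant sites correctly left uncalled (true negative rate)."""
    return true_neg / (true_neg + false_pos)


def concordance(calls_a, calls_b):
    """Fraction of loci genotyped on both platforms whose genotypes agree.

    calls_a and calls_b map a locus identifier to a genotype string.
    Loci present on only one platform are ignored (an assumption of this sketch).
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return float("nan")
    agree = sum(1 for locus in shared if calls_a[locus] == calls_b[locus])
    return agree / len(shared)


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts, chosen only for illustration.
    print(sensitivity(true_pos=98, false_neg=2))
    print(specificity(true_neg=98, false_pos=2))
    # Hypothetical genotype calls from sequencing and from a microarray.
    sequencing = {"locus1": "AA", "locus2": "AG", "locus3": "GG"}
    microarray = {"locus1": "AA", "locus2": "AG", "locus4": "CC"}
    print(concordance(sequencing, microarray))
```

Restricting concordance to loci typed on both platforms is one reasonable convention; other studies also count missing calls as discordant, which would lower the figure.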
Here we present an open-source R package, 'meaRtools', that provides a platform for analyzing neuronal networks recorded on Multi-Electrode Arrays (MEAs). Cultured neuronal networks monitored with MEAs are now widely used to characterize in vitro models of neurological disorders and to evaluate pharmaceutical compounds. meaRtools provides core algorithms for MEA spike train analysis, feature extraction, statistical analysis and plotting of multiple MEA recordings with multiple genotypes and treatments. Its functionality covers novel solutions for spike train analysis, including algorithms to assess electrode cross-correlation using the spike train tiling coefficient (STTC), mutual information, network burst synchronization and entropy within cultured wells. Also integrated is a solution to account for burst variability originating from mixed-cell neuronal cultures. The package provides a statistical platform built specifically for MEA data that can combine multiple MEA recordings and compare extracted features between different genetic models or treatments. We demonstrate the use of meaRtools to identify epilepsy-like phenotypes in neuronal networks from Celf4 KO mice as well as the pharmacological correction of those phenotypes. The package is freely available under the GPL license (GPL >= 3) and is updated frequently on the CRAN web-server repository. The package, along with full documentation, can be downloaded from: https://cran.r-project.org/web/packages/meaRtools/.
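To illustrate the spike train tiling coefficient mentioned above, the following is a minimal Python sketch of the standard STTC definition for two spike trains A and B: STTC = 1/2 [ (P_A - T_B) / (1 - P_A T_B) + (P_B - T_A) / (1 - P_B T_A) ], where T_X is the fraction of the recording lying within ±dt of any spike in X and P_X is the fraction of spikes in X lying within ±dt of a spike in the other train. This is not the meaRtools implementation (which is in R); the function names, argument order, and example spike times are assumptions made for illustration.

```python
from bisect import bisect_left


def tiled_fraction(spikes, dt, start, end):
    """Fraction of [start, end] within dt of any spike (spikes must be sorted)."""
    covered = 0.0
    cur_lo = cur_hi = None
    for s in spikes:
        lo, hi = max(s - dt, start), min(s + dt, end)
        if cur_hi is None:
            cur_lo, cur_hi = lo, hi
        elif lo <= cur_hi:
            # Overlapping windows are merged so time is not counted twice.
            cur_hi = max(cur_hi, hi)
        else:
            covered += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
    if cur_hi is not None:
        covered += cur_hi - cur_lo
    return covered / (end - start)


def proportion_near(a, b, dt):
    """Fraction of spikes in a within dt of at least one spike in b (b sorted)."""
    hits = 0
    for s in a:
        i = bisect_left(b, s - dt)  # first spike in b at or after s - dt
        if i < len(b) and b[i] <= s + dt:
            hits += 1
    return hits / len(a)


def sttc(a, b, dt, start, end):
    """Spike train tiling coefficient of spike-time lists a and b."""
    a, b = sorted(a), sorted(b)
    if not a or not b:
        return float("nan")  # undefined when either train is empty
    ta = tiled_fraction(a, dt, start, end)
    tb = tiled_fraction(b, dt, start, end)
    pa = proportion_near(a, b, dt)
    pb = proportion_near(b, a, dt)

    def term(p, t):
        denom = 1.0 - p * t
        # Treat the degenerate fully tiled case as perfect correlation.
        return 1.0 if denom == 0 else (p - t) / denom

    return 0.5 * (term(pa, tb) + term(pb, ta))


if __name__ == "__main__":
    # Hypothetical spike times in seconds over a 10 s recording.
    electrode_1 = [1.0, 2.0, 3.0]
    electrode_2 = [1.01, 2.02, 6.0]
    print(sttc(electrode_1, electrode_2, dt=0.05, start=0.0, end=10.0))
```

Values near 1 indicate that spikes on the two electrodes co-occur within the chosen window, and values near 0 indicate no more coincidence than the tiling alone would produce. The window dt is a free parameter that the analyst has to choose.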
Deep vein thrombosis and pulmonary embolism, collectively referred to as venous thromboembolism (VTE), are the third leading cause of cardiovascular death in the United States. Genetic factors account for 50-60% of VTE risk, and a recent meta-analysis of genome-wide association studies confirmed that common variants in F5, ABO, and seven other loci are associated with VTE. Rare mutations in the anticoagulant genes PROC, PROS1 and SERPINC1 have been linked to VTE in family studies. To identify new genetic variants altering the risk for VTE, we performed whole exome sequencing (WES) in 373 unrelated individuals of European ancestry with unprovoked VTE and compared the results to a previously sequenced control cohort of 5784 unrelated Europeans. To avoid variant calling bias, only SNVs from exons with >10X coverage and less than 5% difference in coverage between cases and controls were included, removing 11,813 of 188,689 intervals. We used an emerging framework for a "collapsing" analysis on genes, defining qualifying variants on the basis of annotation and minor allele frequency <0.05%, and assumed a dominant model of inheritance. Each gene was tested with Fisher's exact test, for a total of 11,585 CCDS genes. Strikingly, ranked by p-value, the top 4 genes were PROS1 (P=2.01E-09, OR 11.8), STAB2 (P=2.70E-07, OR 3.37), PROC (P=3.24E-05, OR 11.0) and SERPINC1 (P=1.10E-04, OR 8.5). In STAB2, we detected 29 qualifying variants in 373 cases and 106 qualifying variants in 5784 controls. This gene encodes Stabilin-2, a transmembrane glycoprotein scavenger receptor. Common variants at ABO are associated with VTE and are also known to regulate von Willebrand Factor (VWF) and coagulation Factor VIII (F8) plasma levels. Common variants in STAB2 are also associated with VWF/F8 levels in a large GWAS and with VTE risk in a smaller candidate gene study, suggesting that haploinsufficiency for Stabilin-2 may increase VTE risk through elevated levels of VWF/F8.
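The per-gene test in a collapsing analysis can be sketched as a two-sided Fisher's exact test on a 2x2 table of carriers versus non-carriers in cases and controls. The Python below is a minimal, dependency-free illustration under a dominant model, where each individual counts once if they carry at least one qualifying variant. The function names and the example counts are hypothetical, and the sketch is not the authors' analysis code, so it is not expected to reproduce the P values or odds ratios reported above.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    a = case carriers, b = case non-carriers,
    c = control carriers, d = control non-carriers.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_p(x):
        # Hypergeometric log-probability of x carriers among the cases.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = log_p(a)
    tolerance = 1e-7  # absorbs floating-point noise when comparing equal tables
    p = sum(exp(log_p(x)) for x in range(lo, hi + 1) if log_p(x) <= observed + tolerance)
    return min(p, 1.0)


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); infinite when b*c is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


if __name__ == "__main__":
    # Hypothetical gene: 10 carriers among 100 cases, 20 among 1000 controls.
    table = (10, 90, 20, 980)
    print(fisher_exact_two_sided(*table), odds_ratio(*table))
```

In a genome-wide setting this function would be applied once per gene and the resulting P values ranked, with a correction for the number of genes tested; the choice of correction is left open here because the text does not specify one.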
Although replication and functional testing of these findings are warranted, this study demonstrates the utility of collapsing analyses using WES data to identify multiple loci harboring an excess of rare variants in individuals with a common complex disease trait.
Disclosures: Ginsburg: Shire: Equity Ownership, Membership on an entity's Board of Directors or advisory committees, Patents & Royalties: recombinant VWF and recombinant ADAMTS13; Portola Pharmaceuticals: Membership on an entity's Board of Directors or advisory committees.