Next Generation Sequencing studies generate a large quantity of genetic data in a relatively cost- and time-efficient manner and provide an unprecedented opportunity to identify candidate causative variants that lead to disease phenotypes. A challenge to these studies is the generation of sequencing artifacts by current technologies. To identify and characterize the properties that distinguish false positive variants from true variants, we sequenced a child and both parents (one trio) using DNA isolated from three sources (blood, buccal cells, and saliva). The trio strategy allowed us to identify variants in the proband that could not have been inherited from the parents (Mendelian errors) and would most likely indicate sequencing artifacts. We examined quality control measurements and found that three identified the greatest number of Mendelian errors: read depth, genotype quality score, and alternate allele ratio. Filtering the variants on these measurements removed ~95% of the Mendelian errors while retaining 80% of the called variants. These filters were applied independently. After filtering, the concordance between identical samples isolated from different sources was 99.99%, compared with 87% before filtering. This high concordance suggests that different sources of DNA can be used in trio studies without affecting the ability to identify causative polymorphisms. To facilitate analysis of next generation sequencing data, we developed the Cincinnati Analytical Suite for Sequencing Informatics (CASSI), which stores sequencing files and metadata (e.g., relatedness information), handles file versioning, data filtering, and variant annotation, and identifies candidate causative polymorphisms that follow de novo, rare recessive homozygous, or compound heterozygous inheritance models.
We conclude that the data cleaning process improves the signal-to-noise ratio in terms of variants and facilitates the identification of candidate disease causative polymorphisms.
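The trio logic and the three quality filters described in this abstract can be illustrated with a minimal Python sketch. It is not the CASSI implementation: the `Call` record, the function names, and every threshold value below are hypothetical, since the abstract does not report the cutoffs that were used. Genotypes are assumed to be biallelic and autosomal, coded as 0 (reference) and 1 (alternate).

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One genotype call at one site (hypothetical record layout)."""
    genotype: tuple  # two alleles, e.g. (0, 1); 0 = reference, 1 = alternate
    depth: int       # total read depth at the site
    gq: int          # genotype quality score
    alt_reads: int   # reads supporting the alternate allele


def is_mendelian_error(child, mother, father):
    """True if the child's genotype cannot be built from one allele of each parent."""
    a, b = child
    return not ((a in mother and b in father) or (a in father and b in mother))


# Illustrative thresholds only; the abstract does not state the values used.
MIN_DEPTH = 15
MIN_GQ = 20
HET_RATIO_RANGE = (0.3, 0.7)


def filter_flags(call):
    """Evaluate each filter on its own, so their effects can be compared separately."""
    ratio = call.alt_reads / call.depth if call.depth > 0 else 0.0
    is_het = set(call.genotype) == {0, 1}
    low, high = HET_RATIO_RANGE
    return {
        "depth": call.depth >= MIN_DEPTH,
        "genotype_quality": call.gq >= MIN_GQ,
        # The allele ratio check is applied to heterozygous calls only in this sketch.
        "alt_allele_ratio": (low <= ratio <= high) if is_het else True,
    }


def passes_all_filters(call):
    """Convenience wrapper: keep a call only if every filter accepts it."""
    return all(filter_flags(call).values())
```

For example, `is_mendelian_error((0, 1), (0, 0), (0, 0))` flags a heterozygous child of two homozygous-reference parents, which is either a true de novo variant or, more often, a sequencing artifact. Returning per-filter flags keeps each measurement separable, which is one way to read the abstract's note that the filters were applied independently.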
Systemic Lupus Erythematosus (SLE or lupus) (OMIM: 152700) is a chronic autoimmune disease with debilitating inflammation that affects multiple organ systems. The STAT1-STAT4 locus is one of the first and most highly-replicated genetic loci associated with lupus risk. We performed a fine-mapping study to identify plausible causal variants within the STAT1-STAT4 locus associated with increased lupus disease risk. Using complementary frequentist and Bayesian approaches in trans-ancestral Discovery and Replication cohorts, we found one variant whose association with lupus risk is supported across ancestries in both the Discovery and Replication cohorts: rs11889341. In B cell lines from patients with lupus and healthy controls, the lupus risk allele of rs11889341 was associated with increased STAT1 expression. We demonstrated that the transcription factor HMGA1, a member of the HMG transcription factor family with an AT-hook DNA-binding domain, has enriched binding to the risk allele compared to the non-risk allele of rs11889341. We identified a genotype-dependent repressive element in the DNA within the intron of STAT4 surrounding rs11889341. Consistent with expression quantitative trait locus (eQTL) analysis, the lupus risk allele of rs11889341 decreased the activity of this putative repressor. Altogether, we present a plausible molecular mechanism for increased lupus risk at the STAT1-STAT4 locus in which the risk allele of rs11889341, the most probable causal variant, leads to elevated STAT1 expression in B cells due to decreased repressor activity mediated by increased binding of HMGA1.
Genome-wide association studies have identified variants in PXK that confer risk for humoral autoimmune diseases, including systemic lupus erythematosus (SLE or lupus), rheumatoid arthritis and more recently systemic sclerosis. While PXK is involved in trafficking of epidermal growth factor receptor (EGFR) in COS-7 cells, mechanisms linking PXK to lupus pathophysiology have remained undefined. In an effort to uncover the mechanism at this locus that increases lupus risk, we undertook a fine-mapping analysis in a large multi-ancestral study of lupus patients and controls. We define a large (257 kb) common haplotype marking a single causal variant that confers lupus risk detected only in European ancestral populations and spans the promoter through the 3′ UTR of PXK. The strongest association was found at rs6445972 with P < 4.62 × 10⁻¹⁰, OR 0.81 (0.75–0.86). Using stepwise logistic regression analysis, we demonstrate that one signal drives the genetic association in the region. Bayesian analysis confirms our results, identifying a 95% credible set consisting of 172 variants spanning 202 kb. Functionally, we found that PXK operates on the B-cell antigen receptor (BCR); we confirmed that PXK influenced the rate of BCR internalization. Furthermore, we demonstrate that individuals carrying the risk haplotype exhibited a decreased rate of BCR internalization, a process known to impact B cell survival and cell fate. Taken together, these data define a new candidate mechanism for the genetic association of variants around PXK with lupus risk and highlight the regulation of intracellular trafficking as a genetically regulated pathway mediating human autoimmunity.
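For readers unfamiliar with the form "OR 0.81 (0.75–0.86)", the sketch below shows how an odds ratio and an approximate 95% confidence interval can be computed from a 2×2 table of allele counts. It is an illustration only: the study's estimate most likely came from a regression model, and the function name and the counts in the usage example are hypothetical, not taken from the study.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Odds ratio for the alternate allele with a Wald interval (z=1.96 gives ~95%).

    All four counts must be positive, otherwise the estimate is undefined.
    """
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = sqrt(1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up allele counts, for illustration only.
estimate, lower, upper = odds_ratio_ci(900, 3100, 1100, 3000)
print(f"OR {estimate:.2f} ({lower:.2f}-{upper:.2f})")
```

An odds ratio below 1 with an interval that excludes 1, as reported for rs6445972, indicates that the tested allele is less frequent in cases than in controls.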
Population and family-based genetic studies typically result in the identification of genetic variants that are statistically associated with a clinical disease or phenotype. For many diseases and traits, most variants are non-coding, and are thus likely to act by impacting subtle, comparatively hard to predict mechanisms controlling gene expression. Here, we describe a general strategic approach to prioritize non-coding variants, and screen them for their function. This approach involves computational prioritization using functional genomic databases followed by experimental analysis of differential binding of transcription factors (TFs) to risk and non-risk alleles. For both electrophoretic mobility shift assay (EMSA) and DNA affinity precipitation assay (DAPA) analysis of genetic variants, a synthetic DNA oligonucleotide (oligo) is used to identify factors in the nuclear lysate of disease or phenotype-relevant cells. For EMSA, the oligonucleotides with or without bound nuclear factors (often TFs) are analyzed by non-denaturing electrophoresis on a tris-borate-EDTA (TBE) polyacrylamide gel. For DAPA, the oligonucleotides are bound to a magnetic column and the nuclear factors that specifically bind the DNA sequence are eluted and analyzed through mass spectrometry or with a reducing sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) followed by Western blot analysis. This general approach can be widely used to study the function of non-coding genetic variants associated with any disease, trait, or phenotype.
The protracted diagnostic period and variable disease presentation not only complicate diagnosing SLE but also the epidemiologic study of it. Coupled with the remitting and relapsing nature of the disease and the challenges in managing it, clinical research in lupus requires careful attention to study design, control selection, temporality, and many often overlooked issues in the analysis phase. Between "big data" and the impressive advances in the basic sciences, it is tempting to either oversimplify methods to take advantage of "big data" or overcomplicate because the problem itself is complicated. As we revisit the building blocks of epidemiologic research, we will uncover opportunities to move epidemiology and clinical research forward in SLE. Why do we care about effect modification and what is it? Why can we not just adjust for everything that we want to? And perhaps, most importantly, going back to the very beginning and asking ourselves: does this matter? During this talk we will discuss issues relating to case identification methods, potential biases associated with control selection, and return to the basics of epidemiologic research. Although we shall discuss these issues in the context of environmental (nongenetic) factors, these concerns extend across the worlds of observational data analysis, can impact randomized trials, and are relevant for all types of exposures and outcomes.
Microscopic analysis of air samples fails to determine the pollination season overlap between various species with morphologically similar pollen such as members of the Cupressaceae and Poaceae. Previously, we showed it was possible to use PCR to identify Juniperus ashei pollen from air samples. The present study was undertaken to determine if qPCR could be used to determine airborne pollen counts for three Juniperus species and define the pollination season overlap. METHODS: The atmosphere in Tulsa, OK (USA) was monitored with a Burkard sampler and analyzed by microscopy using standard methods. A second Burkard sampler was used from 2013 to 2015 for molecular analysis; 109 samples were tested with species-specific primers and probes designed for J. ashei, J. pinchotii and J. virginiana. Numbers of pollen grains obtained from microscopy were compared with numbers obtained from qPCR by Spearman correlation coefficient. RESULTS: Cupressaceae pollen was detected in the Tulsa atmosphere from October through April. The qPCR counts for total Juniperus pollen showed a significant correlation with the microscope counts, R = 0.92, p < 0.001. Quantitative PCR data showed overlapping pollen seasons. In the fall, data indicated five days (in two years) with both J. pinchotii and J. ashei pollen. Similarly, in January and February eight days (in two years) indicated both J. ashei and J. virginiana pollen in the air. CONCLUSIONS: This approach is a rapid method to identify and quantify specific pollen types and the pollen season overlap where species and genera cannot be distinguished by microscopy. Defining the exact pollination season will be a benefit for patients sensitive to pollen from specific taxa.
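The comparison of microscopy counts with qPCR counts relies on the Spearman correlation coefficient, which is the Pearson correlation computed on ranks. The sketch below shows that calculation in plain Python, with tied values sharing their average rank. The function names and the daily counts in the usage example are hypothetical and are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Made-up daily pollen counts, for illustration only.
microscopy_counts = [12, 45, 3, 150, 88, 0, 27]
qpcr_counts = [10, 60, 5, 170, 70, 1, 20]
rho = spearman(microscopy_counts, qpcr_counts)
```

Because the coefficient depends only on rank order, it suits count data from two methods that may differ in scale, as long as days with more pollen by one method also tend to show more by the other.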