Simon Coetzee scite author profile

motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites

Summary: Functional annotation represents a key step toward the understanding and interpretation of germline and somatic variation as revealed by genome-wide association studies (GWAS) and The Cancer Genome Atlas (TCGA), respectively. GWAS have revealed numerous genetic risk variants residing in non-coding DNA associated with complex diseases. For sequences that lie within enhancers or promoters of transcription, it is not straightforward to assess the effects of variants on likely transcription factor binding sites. Consequently, we introduce motifbreakR, which allows the biologist to judge whether the sequence surrounding a polymorphism or mutation is a good match, and how much information is gained or lost in one allele of the polymorphism or mutation relative to the other. MotifbreakR is flexible, giving a choice of algorithms for interrogation of genomes with motifs from many public sources that users can choose from. MotifbreakR can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor. Availability and implementation: https://github.com/Simon-Coetzee/MotifBreakR, www.bioconductor.org. Contact: dennis.hazelett@cshs.org
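To make the idea of comparing how well each allele matches a motif more concrete, here is a minimal sketch in Python. It is not motifbreakR's implementation: it assumes a plain log-odds position weight matrix, a uniform background, and a made-up four-position count matrix, and all function names (`log_odds_matrix`, `best_window_score`, `allele_effect`) are hypothetical.

```python
import math

# Hypothetical 4-position count matrix (one list per base); not taken from
# any real motif database.
COUNTS = {
    "A": [8, 1, 0, 9],
    "C": [1, 0, 1, 0],
    "G": [0, 9, 0, 1],
    "T": [1, 0, 9, 0],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert base counts into per-position log2-odds scores.

    Assumes a uniform background frequency (an illustrative choice).
    """
    width = len(next(iter(counts.values())))
    matrix = []
    for pos in range(width):
        total = sum(counts[base][pos] + pseudocount for base in "ACGT")
        matrix.append({
            base: math.log2(((counts[base][pos] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def best_window_score(seq, snp_index, pwm):
    """Best score among motif-width windows that cover position snp_index."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    best = -math.inf  # returned unchanged if no window fits
    for start in range(first, last + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        best = max(best, score)
    return best


def allele_effect(flank5, ref, alt, flank3, pwm):
    """Score both alleles in the same flanking context; return (ref, alt, alt - ref)."""
    snp_index = len(flank5)
    ref_score = best_window_score(flank5 + ref + flank3, snp_index, pwm)
    alt_score = best_window_score(flank5 + alt + flank3, snp_index, pwm)
    return ref_score, alt_score, alt_score - ref_score


if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    # Hypothetical variant: reference G completes the consensus AGTA, T disrupts it.
    print(allele_effect("CCA", "G", "T", "TACC", pwm))
```

A negative difference means the alternate allele is a weaker match to the motif than the reference allele; a positive one means it is a stronger match.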

Comprehensive Functional Annotation of 77 Prostate Cancer Risk Loci

Hazelett et al. 2014

Genome-wide association studies (GWAS) have revolutionized the field of cancer genetics, but the causal links between increased genetic risk and onset/progression of disease processes remain to be identified. Here we report the first step in such an endeavor for prostate cancer. We provide a comprehensive annotation of the 77 known risk loci, based upon highly correlated variants in biologically relevant chromatin annotations; we identified 727 such potentially functional SNPs. We also provide a detailed account of possible protein disruption, microRNA target sequence disruption and regulatory response element disruption of all correlated SNPs. 88% of the 727 SNPs fall within putative enhancers, and many alter critical residues in the response elements of transcription factors known to be involved in prostate biology. We define as risk enhancers those regions with enhancer chromatin biofeatures in prostate-derived cell lines with prostate-cancer correlated SNPs. To aid the identification of these enhancers, we performed genome-wide ChIP-seq for H3K27-acetylation, a mark of actively engaged enhancers, as well as the transcription factor TCF7L2. We analyzed in depth three variants in risk enhancers, two of which show significantly altered androgen sensitivity in LNCaP cells. These include rs4907792, which is in linkage disequilibrium with an eQTL for NUDT11 (on the X chromosome) in prostate tissue, and rs10486567, the index SNP in intron 3 of the JAZF1 gene on chromosome 7. Rs4907792 is within a critical residue of a strong consensus androgen response element that is interrupted in the protective allele, resulting in a 56% decrease in its androgen sensitivity, whereas rs10486567 affects both NKX3-1 and FOXA-AR motifs where the risk allele results in a 39% increase in basal activity and a 28% fold-increase in androgen stimulated enhancer activity. Identification of such enhancer variants and their potential target genes represents a preliminary step in connecting risk to disease process.

ELMER v.2: an R/Bioconductor package to reconstruct gene regulatory networks from DNA methylation and transcriptome profiles

Silva, Coetzee, Gull et al. 2018

Motivation: DNA methylation has been used to identify functional changes at transcriptional enhancers and other cis-regulatory modules (CRMs) in tumors and other disease tissues. Our R/Bioconductor package ELMER (Enhancer Linking by Methylation/Expression Relationships) provides a systematic approach that reconstructs altered gene regulatory networks (GRNs) by combining enhancer methylation and gene expression data derived from the same sample set. Results: We present a completely revised version 2 of ELMER that provides numerous new features including an optional web-based interface and a new Supervised Analysis mode to use pre-defined sample groupings. We show that Supervised mode significantly increases statistical power and identifies additional GRNs and associated Master Regulators, such as SOX11 and KLF5 in Basal-like breast cancer. Availability and implementation: ELMER v.2 is available as an R/Bioconductor package at http://bioconductor.org/packages/ELMER/. Supplementary information: Supplementary data are available at Bioinformatics online.

FunciSNP: an R/bioconductor tool integrating functional non-coding data sets with genetic association studies to identify candidate regulatory SNPs

Coetzee, Rhie, Berman et al. 2012

Single nucleotide polymorphisms (SNPs) are increasingly used to tag genetic loci associated with phenotypes such as risk of complex diseases. Technically, this is done genome-wide without prior restriction or knowledge of biological feasibility in scans referred to as genome-wide association studies (GWAS). Depending on the linkage disequilibrium (LD) structure at a particular locus, such tagSNPs may be surrogates for many thousands of other SNPs, and it is difficult to distinguish those that may play a functional role in the phenotype from those that are simply genetically linked. Because a large proportion of tagSNPs have been identified within non-coding regions of the genome, distinguishing functional from non-functional SNPs has been an even greater challenge. A strategy was recently proposed that prioritizes surrogate SNPs based on non-coding chromatin and epigenomic mapping techniques that have become feasible with the advent of massively parallel sequencing. Here, we introduce an R/Bioconductor software package that enables the identification of candidate functional SNPs by integrating information from tagSNP locations, lists of linked SNPs from the 1000 Genomes Project and locations of chromatin features which may have functional significance. Availability: FunciSNP is available from Bioconductor (bioconductor.org).
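The integration step described above (keep SNPs linked to a tagSNP, then ask which of them sit inside a chromatin feature) can be sketched as follows. This is an illustration in Python rather than FunciSNP's own code; the record types, the `candidate_snps` function, the default r² cutoff of 0.5, and the example identifiers are all assumptions made for the sketch.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int     # 1-based position
    r2: float    # LD (r^2) with the tagSNP


class Feature(NamedTuple):
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # inclusive


def candidate_snps(snps, features, min_r2=0.5):
    """Return (rsid, [feature names]) for linked SNPs that overlap a feature.

    min_r2 is an illustrative threshold, not a value taken from the abstract.
    """
    hits = []
    for snp in snps:
        if snp.r2 < min_r2:
            continue  # not correlated strongly enough with the tagSNP
        overlapping = [
            f.name
            for f in features
            if f.chrom == snp.chrom and f.start <= snp.pos <= f.end
        ]
        if overlapping:
            hits.append((snp.rsid, overlapping))
    return hits


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    snps = [
        Snp("snp_a", "chr1", 150, 0.9),
        Snp("snp_b", "chr1", 150, 0.2),
        Snp("snp_c", "chr1", 500, 0.8),
    ]
    features = [Feature("enhancer_1", "chr1", 100, 200)]
    print(candidate_snps(snps, features))
```

The linear scan over features keeps the sketch short; for genome-scale inputs an interval index would be the more practical choice.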

Interleukin-6-Related Genotypes, Body Mass Index, and Risk of Multiple Myeloma and Plasmacytoma

Cozen, Gebregziabher, Conti et al. 2006

Interleukin-6 (IL-6) promotes normal plasma cell development and proliferation of myeloma cells in culture. We evaluated IL-6 genotypes and body mass index (BMI) in a case-control study of multiple myeloma and plasmacytoma. DNA samples and questionnaires were obtained from incident cases of multiple myeloma (n = 134) and plasmacytoma (n = 16; plasma cell neoplasms) ascertained from the Los Angeles County population-based cancer registry and from siblings or cousins of cases (family controls, n = 112) and population controls (n = 126). Genotypes evaluated included IL-6 promoter gene single nucleotide polymorphisms (SNP) at positions −174, −572, and −597; one variable number of tandem repeats (−373 AnTn); and one SNP in the IL-6 receptor (IL-6ra) gene at position −358. The variant allele of the IL-6 promoter SNP −572 was associated with a roughly 2-fold increased risk of plasma cell neoplasms when cases were compared with family [odds ratio (OR), 1.8; 95% confidence interval (95% CI), 0.7-4.7] or population controls (OR, 2.4; 95% CI, 1.2-4.7). The −373 9A/9A genotype was associated with a decreased risk compared with the most common genotype (OR for cases versus family controls, 0.4; 95% CI, 0.1-1.7; OR for cases versus population controls, 0.3; 95% CI, 0.1-0.9). No other SNPs were associated with risk. Obesity (BMI ≥ 30 kg/m²) increased risk nonsignificantly by 40% and 80% when cases were compared with family controls or population controls, respectively, relative to persons with a BMI of <25 kg/m². These results suggest that IL-6 promoter genotypes may be associated with increased risk of plasma cell neoplasms. (Cancer Epidemiol Biomarkers Prev 2006;15(11):2285-91)
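For readers less familiar with the statistics quoted above, the sketch below shows how an unadjusted odds ratio and a Wald 95% confidence interval are computed from a 2×2 table. The counts are hypothetical, the function name `odds_ratio_ci` is made up, and the study itself may well have used adjusted or matched models, so this is not a reconstruction of its analysis.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 for 95%).

    All four cell counts must be positive.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: variant-allele carriers vs non-carriers.
    print(odds_ratio_ci(30, 120, 12, 114))
```

An interval that excludes 1 corresponds to an association that is significant at the 5% level, which is why the population-control estimate (95% CI, 1.2-4.7) reads as significant while the family-control estimate (95% CI, 0.7-4.7) does not.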

Nucleosome positioning and histone modifications define relationships between regulatory elements and nearby gene expression in breast epithelial cells

et al. 2014

Background: The precise nature of how cell type specific chromatin structures at enhancer sites affect gene expression is largely unknown. Here we identified cell type specific enhancers coupled with gene expression in two different types of breast epithelial cells, HMEC (normal breast epithelial cells) and MDAMB231 (triple negative breast cancer cell line). Results: Enhancers were defined by modified neighboring histones [using chromatin immunoprecipitation followed by sequencing (ChIP-seq)] and nucleosome depletion [using formaldehyde-assisted isolation of regulatory elements followed by sequencing (FAIRE-seq)]. Histone modifications at enhancers were related to the expression levels of nearby genes up to 750 kb away. These expression levels were correlated with enhancer status (poised or active), defined by surrounding histone marks. Furthermore, about fifty percent of poised and active enhancers contained nucleosome-depleted regions. We also identified response element motifs enriched at these enhancer sites that revealed key transcription factors (e.g. TP63) likely involved in regulating breast epithelial enhancer-mediated gene expression. By utilizing expression data, potential target genes of more than 600 active enhancers were identified. These genes were involved in proteolysis, epidermis development, cell adhesion, mitosis, cell cycle, and DNA replication. Conclusions: These findings facilitate the understanding of epigenetic regulation specifically, such as the relationships between regulatory elements and gene expression and generally, how breast epithelial cellular phenotypes are determined by cell type specific enhancers. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-331) contains supplementary material, which is available to authorized users.

Comprehensive Functional Annotation of Seventy-One Breast Cancer Risk Loci

et al. 2013

Breast Cancer (BCa) genome-wide association studies revealed allelic frequency differences between cases and controls at index single nucleotide polymorphisms (SNPs). To date, 71 loci have thus been identified and replicated. More than 320,000 SNPs at these loci define BCa risk due to linkage disequilibrium (LD). We propose that BCa risk resides in a subgroup of SNPs that functionally affects breast biology. Such a shortlist will aid in framing hypotheses to prioritize a manageable number of likely disease-causing SNPs. We extracted all SNPs residing in 1 Mb windows around each breast cancer risk index SNP from the 1000 Genomes Project to find correlated SNPs. We used FunciSNP, an R/Bioconductor package developed in-house, to identify potentially functional SNPs at 71 risk loci by coinciding them with chromatin biofeatures. We identified 1,005 SNPs in LD with the index SNPs (r² ≥ 0.5) in three categories: 21 in exons of 18 genes, 76 in transcription start site (TSS) regions of 25 genes, and 921 in enhancers. Thirteen SNPs were found in more than one category. We found two correlated and predicted non-benign coding variants (rs8100241 in exon 2 and rs8108174 in exon 3) of the gene ANKLE1. Most putative functional LD SNPs, however, were found in either epigenetically defined enhancers or in gene TSS regions. Fifty-five percent of these non-coding SNPs are likely functional, since they affect response element (RE) sequences of transcription factors. Functionality of these SNPs was assessed by expression quantitative trait loci (eQTL) analysis and allele-specific enhancer assays. Unbiased analyses of SNPs at BCa risk loci revealed new and overlooked mechanisms that may affect risk of the disease, thereby providing a valuable resource for follow-up studies.
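The per-category counts in this abstract sum to more than the reported total, which the thirteen multi-category SNPs account for. As a quick check, assuming each of those thirteen SNPs falls in exactly two categories (an assumption; the abstract does not say so):

```latex
\underbrace{21}_{\text{exons}} + \underbrace{76}_{\text{TSS regions}} + \underbrace{921}_{\text{enhancers}} = 1018,
\qquad
1018 - 13 = 1005 \ \text{distinct SNPs}.
```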

Enrichment of risk SNPs in regulatory regions implicate diverse tissues in Parkinson’s disease etiology

Coetzee, Pierce, Brundin et al. 2016. Sci Rep

Recent genome-wide association studies (GWAS) of Parkinson’s disease (PD) revealed at least 26 risk loci, with associated single nucleotide polymorphisms (SNPs) located in non-coding DNA having unknown functions in risk. In order to explore in which cell types these SNPs (and their correlated surrogates at r² ≥ 0.8) could alter cellular function, we assessed their location overlap with histone modification regions that indicate transcription regulation in 77 diverse cell types. We found statistically significant enrichment of risk SNPs at 12 loci in active enhancers or promoters. We investigated in depth the 4 risk loci that were most significantly enriched (−logₑP > 14) and contained 8 putative enhancers in the different cell types. These enriched loci, along with eQTL associations, were unexpectedly present in non-neuronal cell types. These included lymphocytes, mesendoderm, liver and fat cells, indicating that cell types outside the brain are involved in the genetic predisposition to PD. Annotating regulatory risk regions within specific cell types may unravel new putative risk mechanisms and molecular pathways that contribute to PD development.
