JAC)

¶ These authors contributed equally to this work.

Key words
gene regulatory enhancer, evolutionary conservation, mammals, machine learning, support vector machines, convolutional neural networks, deep learning, regulatory code

Results
Enhancers can be predicted from short DNA sequence patterns in mammals
Genome-wide enhancer activity across many mammalian species was recently assayed by profiling enhancer-associated histone modifications in the adult liver [12], developing limb [8] and developing brain [46]. Certain chemical modifications to histones, such as acetylation of lysine 27 of histone H3 (H3K27ac) and lack of trimethylation of lysine 4 of H3 (H3K4me3), are significantly associated with active enhancers. Determining the genomic locations of these modifications via ChIP-seq provides a genome-wide proxy for the active enhancer landscape [27,28]. For brevity, we refer to genomic regions
Interactions between genetic variants, also called epistasis, are pervasive in model organisms; however, their importance in humans remains unclear because statistical interactions in observational studies can be explained by processes other than biological epistasis. Using statistical modeling, we identified 1,093 interactions between pairs of cis-regulatory variants impacting gene expression in lymphoblastoid cell lines. Factors known to confound these analyses (ceiling/floor effects, population stratification, haplotype effects, or single variants tagged through linkage disequilibrium) explained most of these interactions. However, we found 15 interactions robust to these explanations, and we further show that despite potential confounding, interacting variants were enriched in numerous regulatory regions suggesting potential biological importance. While genetic interactions may not be the true underlying mechanism of all our statistical models, our analyses discover new signals undetected in standard single-marker analyses. Ultimately, we identified new complex genetic architectures regulating 23 genes, suggesting that single-variant analyses may miss important modifiers.
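To make the notion of a statistical interaction concrete, the sketch below compares an additive model of expression on two variants with a model that adds their product term. The abstract does not specify the model used, so this is only one common formulation and not the authors' method. The 0/1/2 allele-dosage coding, the toy data, and the function names (`fit_ols`, `interaction_test`) are all assumptions made for illustration.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations.

    Returns (coefficients, residual sum of squares).
    """
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    rss = sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, yi in zip(rows, y)
    )
    return beta, rss


def interaction_test(g1, g2, expr):
    """Compare expr ~ g1 + g2 against expr ~ g1 + g2 + g1*g2.

    Returns the interaction coefficient and the F statistic (1 numerator
    degree of freedom) for adding the product term.
    """
    additive = [[1.0, a, b] for a, b in zip(g1, g2)]
    full = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]
    _, rss_additive = fit_ols(additive, expr)
    beta, rss_full = fit_ols(full, expr)
    residual_df = len(expr) - 4
    f_stat = (rss_additive - rss_full) / (rss_full / residual_df)
    return beta[3], f_stat


# Hypothetical toy data: every genotype combination twice, with a built-in
# interaction effect of 0.8 and a small deterministic perturbation.
g1, g2, expr = [], [], []
for a, b in product([0, 1, 2], repeat=2):
    for noise in (0.05, -0.05):
        g1.append(float(a))
        g2.append(float(b))
        expr.append(1.0 + 0.5 * a + 0.2 * b + 0.8 * a * b + noise)

beta_int, f_stat = interaction_test(g1, g2, expr)
print(f"interaction coefficient: {beta_int:.3f}, F statistic: {f_stat:.1f}")
```

A large F statistic here only says that the product term improves the fit. As the abstract stresses, such a signal can also arise from ceiling/floor effects, population stratification, haplotype effects, or a single variant tagged through linkage disequilibrium, so it would need to be checked against those explanations before being read as biological epistasis.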