Differences in cells’ functions arise from differential activity of regulatory elements, including enhancers. Enhancers are cis-regulatory elements that cooperate with promoters through transcription factors to activate the expression of one or several genes by coming into physical proximity with them in the 3D space of the nucleus. There is increasing evidence that genetic variants associated with common diseases are enriched in enhancers active in cell types relevant to these diseases. Identifying the enhancers associated with each gene and, conversely, the sets of genes activated by each enhancer (the so-called enhancer/gene or E/G relationships) across cell types can help in understanding the genetic mechanisms underlying human diseases. There are three broad approaches for the genome-wide identification of E/G relationships in a cell type: 1) genetic link methods, or eQTL; 2) functional link methods based on 1D functional data such as open chromatin, histone marks or gene expression; and 3) spatial link methods based on 3D data such as Hi-C. Since 1) and 3) are costly, the current strategy is to develop functional link methods and to use data from 1) and 3) as references to evaluate them. However, there is still no consensus on the best functional link method to date, and method comparisons remain rare. Here, we compared the relative performances of three recent methods for the identification of enhancer-gene links, TargetFinder, Average-Rank, and the ABC model, using the three latest benchmarks from the field: a reference that combines 3D and eQTL data, called BENGI, and two genetic screening references, called CRiFF and CRISPRi. Overall, no single method performed best on all three references. The CRiFF and CRISPRi reference sets are likely more reliable, but CRiFF is not genome-wide, and both CRiFF and CRISPRi are mostly available for the K562 cancer cell line. The BENGI reference set is genome-wide but likely contains many false positives.
This study therefore calls for new reliable and genome-wide E/G reference data rather than new functional link E/G identification methods.
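As an illustration of what evaluating a functional link method against a reference set can look like, the following is a minimal sketch that ranks predicted E/G pairs by score and computes average precision against a set of reference positives. It is not the evaluation code of the study: the function name `average_precision`, the enhancer and gene identifiers, and the scores are all hypothetical, and the choice of average precision as the metric is an assumption made for the example.

```python
# Minimal sketch (assumed setup): score-ranked E/G predictions are compared
# with a reference set of positive enhancer/gene pairs.

def average_precision(scored_pairs, reference_positives):
    """Average precision of ranked (pair, score) predictions.

    scored_pairs: list of ((enhancer_id, gene_id), score) tuples.
    reference_positives: set of (enhancer_id, gene_id) pairs treated as true links.
    Reference positives that were never scored count as missed.
    """
    if not reference_positives:
        return 0.0
    # Rank predictions from highest to lowest score.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (pair, _score) in enumerate(ranked, start=1):
        if pair in reference_positives:
            hits += 1
            # Precision at the rank where this true link is recovered.
            precision_sum += hits / rank
    return precision_sum / len(reference_positives)


# Hypothetical predictions and reference, for illustration only.
predictions = [
    (("enh1", "GENE_A"), 0.9),
    (("enh2", "GENE_A"), 0.7),
    (("enh3", "GENE_B"), 0.4),
    (("enh4", "GENE_B"), 0.1),
]
reference = {("enh1", "GENE_A"), ("enh3", "GENE_B")}

ap = average_precision(predictions, reference)
print(f"average precision: {ap:.3f}")
```

Running the same function on the same predictions with different reference sets (for instance a 3D/eQTL-based one and a CRISPR-based one) would show how a method's apparent ranking can depend on the reference used.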
Background: Intensive selection of modern pig breeds has resulted in genetic improvement of production traits, while local pig breeds have remained less productive. As local breeds have been raised in extensive systems, they have adapted to specific environmental conditions, resulting in rich genotypic and phenotypic diversity. In this study, European local pig breeds were genotypically and phenotypically characterised using DNA-pool sequencing data and breed-level phenotypes related to stature, fatness, growth and reproductive performance traits. These data were analysed using a dedicated approach to detect selection signatures linked to phenotypic traits in order to uncover potential candidate genes that may underlie adaptation to specific environments.
Results: Genetic data analysis of European pig breeds revealed four main axes of genetic variation represented by Iberian and modern breeds (i.e. Large White, Landrace, and Duroc). In addition, breeds clustered according to their geographical origin, for example the French Gascon and Basque breeds, the Italian Apulo Calabrese and Casertana breeds, and the Spanish Iberian and Portuguese Alentejano breeds. Principal component analysis of phenotypic data distinguished between larger and leaner breeds with better growth potential and reproductive performance on the one hand, and breeds that were smaller and fatter, with low growth and reproductive efficiency, on the other hand. Linking selection signatures with phenotypes identified 561 significant genomic regions. Among them, several regions contained candidate genes with a possible biological effect on stature, fatness, growth and reproductive performance traits. For example, strong associations were found for stature in two regions containing the ANXA4 and ANTXR1 genes, for growth performance in a region containing the TLL1 gene, for fatness in regions containing the DNMT3A and POMC genes, and for reproductive performance in a region containing the HSD17B7 gene.
Conclusions: The present study on European local pig breeds used a newly developed approach for searching for selection signatures supported by phenotypic data at the breed level, in order to identify potential candidate genes that may have adapted to different living environments and production systems. The results can be useful for defining conservation programs for local pig breeds.
Differences in cells' functions arise from differential action of regulatory elements, in particular enhancers. Like promoters, enhancers are genomic regions bound by transcription factors (TFs); they activate the expression of one or several genes by coming into physical proximity with them in the 3D space of the nucleus. As there is increasing evidence that variants associated with common diseases are located in enhancers active in cell types relevant to these diseases, knowing the set of enhancers and, more importantly, the sets of genes activated by each enhancer (the so-called enhancer/gene or E/G relationships) in a cell type should help in understanding these diseases. There are three broad approaches for the genome-wide identification of E/G relationships in a cell type: (1) genetic link methods, or eQTL; (2) functional link methods based on 1D functional data such as open chromatin, histone marks and gene expression; and (3) spatial link methods based on 3D data such as Hi-C. Since (1) and (3) are costly, there has been a focus on developing functional link methods and using data from (1) and (3) to evaluate them. However, there is still no consensus on the best functional link method to date. For this reason, we started from the two latest benchmarks in the field, one based on the CRISPRi-FlowFISH (CRiFF) technique and one based on 3D and eQTL data (BENGI), and evaluated the method claimed to be the best in each of these benchmark studies, namely the ABC model and the Average-Rank method respectively, on the other method's reference data. Not only did we manage to reproduce the results of the two benchmarks, but we also found that neither method performed best on both reference data sets. While the CRiFF reference data are very reliable, they are not genome-wide and are mostly available for a cancer cell type. BENGI, on the other hand, is genome-wide but may contain many false positives.
This study therefore calls for new reliable and genome-wide E/G reference data rather than new functional link E/G identification methods.