Abstract: Chicken is a valuable model for understanding fundamental biology, vertebrate evolution and diseases, as well as a major source of nutrient-dense and lean-protein-enriched food globally. Although it is the first non-mammalian amniote genome to be sequenced, the chicken genome still lacks a systematic characterization of functional impacts of genetic variants. Here, through integrating 7,015 RNA-Seq and 2,869 whole-genome sequence data, the Chicken Genotype-Tissue Expression (ChickenGTEx) project presents the p…
“…However, a strong depletion of non-coding variants in typical RNA-seq datasets results in a less reliable imputation of variants that are distant to transcribed regions. Similar observations have been made in chicken (11), pig (12), and human (13).…”
Section: Introduction
supporting
confidence: 87%
“…Livestock GTEx consortia rely on RNA-seq for variant calling (e.g., cattle (8), pig (12), and chicken (11)) to enable molecular QTL mapping as most RNA-sequenced samples don’t have matched DNA-based genotypes or sequences. This is different to the equivalent human GTEx (13) which uses transcriptomes that have matched DNA whole-genome sequencing.…”
Section: Discussion
mentioning
confidence: 99%
“…Furthermore, the low agreement we observed for the top associated variants between DNA-seq and RNA-seq variants would weaken downstream analyses like colocalization of putative causal variants (4) if depending only on RNA-seq variants. Livestock GTEx consortia rely on RNA-seq for variant calling (e.g., cattle (8), pig (12), and chicken (11)) to enable molecular QTL mapping as most RNA-sequenced samples don't have matched DNA-based genotypes or sequences. This is different to the equivalent human GTEx (13) which uses transcriptomes that have matched DNA whole-genome sequencing.…”
Association testing between molecular phenotypes and genomic variants can help to understand how genotype affects phenotype. RNA sequencing provides access to molecular phenotypes such as gene expression and alternative splicing while DNA sequencing or microarray genotyping are the prevailing options to obtain genomic variants. Here we genotype variants for 74 male Braunvieh cattle from both DNA and deep total RNA sequencing from three tissues. We show that RNA sequencing calls approximately 40% of variants (7-10 million) called from DNA sequencing, with over 80% precision, rising to over 92% of variants called with nearly 98% precision in highly expressed coding regions. Allele-specific expression and putative post-transcriptional modifications negatively impact variant genotyping accuracy from RNA sequencing and contribute to RNA-DNA differences. Variants called from RNA sequencing detect roughly 75% of eGenes identified using variants called from DNA sequencing, demonstrating a nearly 2-fold enrichment of eQTL variants. We observe a moderate-to-strong correlation in nominal association p-values (Spearman ρ² ~0.6), although only 9% of eGenes have the same top associated variant. We also find several highly significant RNA variant-only eQTL, demonstrating that caution must be exercised beyond filtering for variant quality or imputation accuracy when analysing or imputing variants called from RNA sequencing.
“…This extensive collection will serve as a valuable genetic database for studying the evolution and improvement of chickens and other birds. Additionally, our project will collaborate with the FAANG project (43) and gradually integrate the GCRP with regulatory elements atlas (44), and ChickenGTEx (45). This integration can then be applied to meta- and colocalization analyses (46,47), facilitating a more comprehensive analysis of gene regulatory mechanisms.…”
Chickens are a crucial source of protein for humans and a popular model animal for bird research. Despite the emergence of imputation as a reliable genotyping strategy for large populations, the lack of a high-quality chicken reference panel has hindered progress in chicken genome research. To address this issue, here we introduce the first phase of the 100 K Global Chicken Reference Panel Project (100 K GCRPP). The project includes 13,187 samples and provides services for varied applications on its website (http://farmrefpanel.com/GCRP/). Currently, two panels are available: a Comprehensive Mix Panel (CMP) for domestication diversity research and a Commercial Breed Panel (CBP) for breeding broilers specifically. Evaluation of genotype imputation quality showed that CMP had the highest imputation accuracy compared to imputation using the existing chicken panel in animal SNPAtlas, whereas CBP performed stably in the imputation of commercial populations. Additionally, we found that genome-wide association studies using GCRP-imputed data, whether on simulated or real phenotypes, exhibited greater statistical power. In conclusion, our study indicates that the GCRP effectively fills the gap in high-quality reference panels for chickens, providing an effective imputation platform for future genetic and breeding research.
“…For instance, AQUA-FAANG relied on nf-core pipelines to exploit the evolutionary conservation of genomic regulatory elements and epigenetic states across six farmed fish species for multiple biological conditions and matched tissue panels [29]. Data integration across projects is critical, as projects such as the farm animal GTEx [30][31][32] are expected to keep integrating a larger number of species and deeper sequencing over time. Interoperability, achieved through the use of a common analytic framework is therefore an essential component of long-term sustainability, well beyond the goal and lifetime of projects like EuroFAANG.…”
Section: Uptake Of Nf-core By the Farmed Animals Genomics Community
mentioning
Standardised analysis pipelines are an important part of FAIR bioinformatics research. Over the last decade, there has been a notable shift from point-and-click pipeline solutions such as Galaxy towards command-line solutions such as Nextflow and Snakemake. We report on recent developments in the nf-core and Nextflow frameworks that have led to widespread adoption across many scientific communities. We describe how adopting nf-core standards enables faster development, improved interoperability, and collaboration with the >8,000 members of the nf-core community. The recent development of Nextflow Domain-Specific Language 2 (DSL2) allows pipeline components to be shared and combined across projects. The nf-core community has harnessed this with a library of modules and subworkflows that can be integrated into any Nextflow pipeline, enabling research communities to progressively transition to nf-core best practices. We present a case study of nf-core adoption by six European research consortia, grouped under the EuroFAANG umbrella and dedicated to farmed animal genomics. We believe that the process outlined in this report can inspire many large consortia to seek harmonisation of their data analysis procedures.