Animal breed identification has wide and important application prospects in the field of genetic breeding. It not only provides effective genetic information for the selection and breeding of superior animals (Behl et al., 2006), but also provides new methods for the traceability of animal products (Dalvit et al., 2007). It also plays a vital role in biological science research (Yaro et al., 2017), pedigree identification (Dreger et al., 2016) and breed resource conservation (Weigend et al., 2004). The earliest breed identification relied mainly on morphological traits (Ceccobelli et al., 2016).
Background
Compared to medium-density single nucleotide polymorphism (SNP) data, high-density SNP data contain abundant genetic variants and provide more information for the genetic evaluation of livestock, but it has been shown that they do not confer any advantage for genomic prediction and heritability estimation. One possible reason is the uneven distribution of the linkage disequilibrium (LD) along the genome, i.e., LD heterogeneity among regions. The aim of this study was to effectively use genome-wide SNP data for genomic prediction and heritability estimation by using models that control LD heterogeneity among regions.
Methods
The LD-adjusted kinship (LDAK) and LD-stratified multicomponent (LDS) models were used to control LD heterogeneity among regions and were compared with the classical model that has no such control. Simulated and real traits of 2000 dairy cattle individuals with imputed high-density (770K) SNP data were used. Five types of phenotypes were simulated, which were controlled by very strongly, strongly, moderately, weakly and very weakly tagged causal variants, respectively. The performances of the models with high- and medium-density (50K) panels were compared to verify that the models that controlled LD heterogeneity among regions were more effective with high-density data.
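The idea behind an LD-stratified multicomponent model can be illustrated with a minimal Python sketch: score each SNP by its local LD, split the SNPs into strata by that score, and build one genomic relationship matrix (GRM) per stratum, so that each stratum can later receive its own variance component. Everything specific here is an assumption for illustration rather than the study's actual pipeline: the function names, the window measured in number of SNPs, the use of equal-sized strata, and the VanRaden-style GRM scaling. Fitting the variance components themselves is left to a mixed-model tool and is not shown.

```python
import numpy as np


def vanraden_grm(genotypes):
    """VanRaden-style GRM from an (n_individuals, n_snps) array of 0/1/2 counts."""
    p = genotypes.mean(axis=0) / 2.0           # allele frequency per SNP
    centered = genotypes - 2.0 * p             # centre each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))        # assumes some SNP is polymorphic
    return centered @ centered.T / denom


def windowed_ld_scores(genotypes, window=50):
    """Sum of squared correlations between each SNP and its neighbours.

    `window` is counted in SNPs on either side (an illustrative choice;
    a physical distance in kb could be used instead).
    """
    n_ind, n_snps = genotypes.shape
    std = genotypes.std(axis=0)
    safe_std = np.where(std > 0, std, 1.0)     # monomorphic SNPs get score 0
    z = (genotypes - genotypes.mean(axis=0)) / safe_std
    scores = np.empty(n_snps)
    for j in range(n_snps):
        lo, hi = max(0, j - window), min(n_snps, j + window + 1)
        r = z[:, lo:hi].T @ z[:, j] / n_ind    # correlations with SNP j
        scores[j] = np.sum(r ** 2)
    return scores


def ld_stratified_grms(genotypes, n_strata=4, window=50):
    """Split SNPs into equal-sized strata by LD score; return one GRM per stratum."""
    scores = windowed_ld_scores(genotypes, window)
    order = np.argsort(scores)                 # lowest-LD SNPs first
    strata = np.array_split(order, n_strata)   # arrays of SNP column indices
    grms = [vanraden_grm(genotypes[:, idx]) for idx in strata]
    return grms, strata, scores
```

In a multicomponent analysis, the GRMs returned by `ld_stratified_grms` would each enter the mixed model as a separate random effect, in contrast to the classical model, which uses a single GRM built from all SNPs.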
Results
Compared to the medium-density panel, the use of the high-density panel did not improve and even decreased prediction accuracies and heritability estimates from the classical model for both simulated and real traits. Compared to the classical model, LDS effectively improved the accuracy of genomic predictions and unbiasedness of heritability estimates, regardless of the genetic architecture of the trait. LDAK applies only to traits that are mainly controlled by weakly tagged causal variants, but is still less effective than LDS for this type of trait. Compared with the classical model, LDS improved prediction accuracy by about 13% for simulated phenotypes and by 0.3 to ~10.7% for real traits with the high-density panel, and by ~1% for simulated phenotypes and by −0.1 to ~6.9% for real traits with the medium-density panel.
Conclusions
Grouping SNPs based on regional LD to construct the LD-stratified multicomponent model can effectively eliminate the adverse effects of LD heterogeneity among regions, and greatly improve the efficiency of high-density SNP data for genomic prediction and heritability estimation.
The Farm animal Genotype-Tissue Expression (FarmGTEx, https://www.farmgtex.org/) project has been established to develop a comprehensive public resource of genetic regulatory variants in domestic animal species, which is essential for linking genetic polymorphisms to variation in phenotypes, helping fundamental biology discovery and exploitation in animal breeding and human biomedicine. Here we present results from the pilot phase of PigGTEx (http://piggtex.farmgtex.org/), where we processed 9,530 RNA-sequencing and 1,602 whole-genome sequencing samples from pigs. We build a pig genotype imputation panel, characterize the transcriptional landscape across over 100 tissues, and associate millions of genetic variants with five types of transcriptomic phenotypes in 34 tissues. We study interactions between genotype and breed/cell type, evaluate tissue specificity of regulatory effects, and elucidate the molecular mechanisms of their action using multi-omics data. Leveraging this resource, we decipher regulatory mechanisms underlying about 80% of the genetic associations for 207 pig complex phenotypes, and demonstrate the similarity of pigs to humans in gene expression and the genetic regulation behind complex phenotypes, corroborating the importance of pigs as a human biomedical model.