Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, plant and animal breeding, and adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weighs the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits.
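The contrast between the two models can be written out as a short sketch. The notation below is an assumption introduced here for illustration and does not appear in the abstract: $\mathbf{y}$ is the vector of phenotypes, $\mathbf{X}\mathbf{b}$ the fixed effects, $\mathbf{G}$ a genomic relationship matrix built from all markers, $\mathbf{G}_f$ one built only from markers inside the genomic feature, and $\mathbf{G}_r$ one built from the remaining markers.

```latex
\begin{align*}
% GBLUP: one genomic effect, so all marker relationships share one variance
\text{GBLUP:}\quad
  \mathbf{y} &= \mathbf{X}\mathbf{b} + \mathbf{g} + \mathbf{e},
  & \mathbf{g} &\sim N(\mathbf{0}, \mathbf{G}\sigma_g^2) \\
% GFBLUP: feature and non-feature markers each get their own variance
\text{GFBLUP:}\quad
  \mathbf{y} &= \mathbf{X}\mathbf{b} + \mathbf{g}_f + \mathbf{g}_r + \mathbf{e},
  & \mathbf{g}_f &\sim N(\mathbf{0}, \mathbf{G}_f\sigma_f^2),\;
    \mathbf{g}_r \sim N(\mathbf{0}, \mathbf{G}_r\sigma_r^2) \\
% Residual term, assumed identical in both models
  & & \mathbf{e} &\sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)
\end{align*}
```

Under this sketch, every marker still enters the GFBLUP model, but estimating $\sigma_f^2$ and $\sigma_r^2$ separately is what would let relationships at feature markers be weighted differently from those at the remaining markers.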
Background: Bovine mastitis is one of the most costly and prevalent diseases affecting dairy cows worldwide. In order to develop new strategies to prevent Escherichia coli-induced mastitis, a detailed understanding of the molecular mechanisms underlying the host immune response to an E. coli infection is necessary. To this end, we performed a global gene-expression analysis of mammary gland tissue collected from dairy cows that had been exposed to a controlled E. coli infection. Biopsy samples of healthy and infected udder tissue were collected at T = 24 h post-infection (p.i.) and at T = 192 h p.i. to represent the acute phase response (APR) and chronic stage, respectively. Differentially expressed (DE) genes for each stage were analyzed, and the DE genes detected at T = 24 h were also compared to data collected from two previous E. coli mastitis studies that were carried out on post mortem tissue.
Results: Nine hundred eighty-two transcripts were found to be differentially expressed in infected tissue at T = 24 h (P < 0.05). Up-regulated transcripts (699) were largely associated with immune response functions, while the down-regulated transcripts (229) were principally involved in fat metabolism. At T = 192 h, all of the up-regulated transcripts were associated with tissue healing processes. Comparison of T = 24 h DE genes detected in the three E. coli mastitis studies revealed that 248 were common and mainly involved in immune response functions. KEGG pathway analysis indicated that these genes were involved in 12 pathways related to the pro-inflammatory response and APR, but also identified significant representation of two unexpected pathways: the natural killer cell-mediated cytotoxicity pathway (KEGG04650) and the RIG-I-like receptor signalling pathway (KEGG04622).
Conclusions: In E. coli-induced mastitis, infected mammary gland tissue was found to significantly up-regulate expression of genes related to the immune response and down-regulate genes related to fat metabolism.
Up to 25% of the DE immune response genes common to the three E. coli mastitis studies at T = 24 h were independent of E. coli strain and dose, cow lactation stage and number, tissue collection method and gene analysis method used. Hence, these DE genes likely represent important mediators of the local APR against E. coli in the mammary gland.
Genomic selection uses genome-wide marker information to predict breeding values for traits of economic interest, and is more accurate than pedigree-based methods. The development of high density SNP arrays for Atlantic salmon has enabled genomic selection in selective breeding programs, alongside high-resolution association mapping of the genetic basis of complex traits. However, in sibling testing schemes typical of salmon breeding programs, trait records are available on many thousands of fish with close relationships to the selection candidates. Therefore, routine high density SNP genotyping may be prohibitively expensive. One way to reduce genotyping cost is genotype imputation, where selected key animals (e.g., breeding program parents) are genotyped at high density, and the majority of individuals (e.g., performance tested fish and selection candidates) are genotyped at much lower density, followed by imputation to high density. The main objective of the current study was to assess the feasibility and accuracy of genotype imputation in the context of a salmon breeding program. The specific aims were: (i) to measure the accuracy of genotype imputation using medium (25 K) and high (78 K) density mapped SNP panels, by masking varying proportions of the genotypes and assessing the correlation between the imputed genotypes and the true genotypes; and (ii) to assess the efficacy of imputed genotype data in genomic prediction of key performance traits (sea lice resistance and body weight). Imputation accuracies of up to 0.90 were observed using the simple two-generation pedigree dataset, and moderately high accuracy (0.83) was possible even with very low density SNP data (250 SNPs). The performance of genomic prediction using imputed genotype data was comparable to using true genotype data, and both were superior to pedigree-based prediction.
These results demonstrate that the genotype imputation approach used in this study can provide a cost-effective method for generating robust genome-wide SNP data for genomic prediction in Atlantic salmon. Genotype imputation approaches are likely to form a critical component of cost-efficient genomic selection programs to improve economically important traits in aquaculture.
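The masking-based accuracy measure described in aim (i) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the study: genotypes are assumed to be coded as 0/1/2 allele counts, entries are masked at random, and `impute_with_marker_mean` is a deliberately naive stand-in for a real imputation program, used only so the example runs end to end. Accuracy is taken as the Pearson correlation between true and imputed values at the masked positions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def mask_genotypes(genotypes, proportion, rng):
    """Copy the genotype matrix and set a random proportion of calls to None.

    Returns the masked copy and the list of (individual, snp) positions masked.
    """
    positions = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    n_mask = round(proportion * len(positions))
    masked_positions = rng.sample(positions, n_mask)
    masked = [list(row) for row in genotypes]
    for i, j in masked_positions:
        masked[i][j] = None
    return masked, masked_positions


def impute_with_marker_mean(masked):
    """Placeholder imputation (an assumption, not the study's method):
    fill each missing call with the mean of the observed calls at that SNP."""
    imputed = [list(row) for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        fill = sum(observed) / len(observed) if observed else 0.0
        for row in imputed:
            if row[j] is None:
                row[j] = fill
    return imputed


def imputation_accuracy(true, imputed, masked_positions):
    """Correlation between true and imputed genotypes at the masked positions."""
    true_vals = [true[i][j] for i, j in masked_positions]
    imputed_vals = [imputed[i][j] for i, j in masked_positions]
    return pearson(true_vals, imputed_vals)


if __name__ == "__main__":
    rng = random.Random(42)
    # Simulated genotypes: 50 individuals x 30 SNPs, coded as 0/1/2.
    true = [[rng.choice([0, 1, 2]) for _ in range(30)] for _ in range(50)]
    for proportion in (0.1, 0.5, 0.9):
        masked, positions = mask_genotypes(true, proportion, rng)
        imputed = impute_with_marker_mean(masked)
        print(proportion, imputation_accuracy(true, imputed, positions))
```

Because the simulated genotypes are independent random draws and the stand-in imputer ignores pedigree and linkage, the accuracies printed by this sketch say nothing about the values reported above; the point is only the mask, impute, correlate workflow.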