Background: One of the main limitations of many livestock breeding programs is that selection is in pure breeds housed in high-health environments but the aim is to improve crossbred performance under field conditions. Genomic selection (GS) using high-density genotyping could be used to address this. However, in crossbred populations, 1) effects of SNPs may be breed specific, and 2) linkage disequilibrium may not be restricted to markers that are tightly linked to the QTL. In this study we apply GS to select for commercial crossbred performance and compare a model with breed-specific effects of SNP alleles (BSAM) to a model where SNP effects are assumed the same across breeds (ASGM). The impact of breed relatedness (generations since separation), size of the population used for training, and marker density was evaluated. The trait phenotype was controlled by 30 QTL and had a heritability of 0.30 for crossbred individuals. A Bayesian method (Bayes-B) was used to estimate the SNP effects in the crossbred training population, and the accuracy of the resulting GS breeding values for commercial crossbred performance was validated in the purebred population.
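As an illustration of how the two models differ, the following minimal Python sketch builds the SNP covariates for a single F1 crossbred animal. It assumes phased genotypes in which the paternal haplotype comes from breed A and the maternal haplotype from breed B; the function names and the 0/1 allele coding are hypothetical and are not taken from the study.

```python
def asgm_covariates(paternal, maternal):
    """ASGM-style coding: one covariate per SNP, the allele count (0, 1 or 2)."""
    return [p + m for p, m in zip(paternal, maternal)]


def bsam_covariates(paternal, maternal):
    """BSAM-style coding: two covariates per SNP, one per breed of origin.

    Assumption: the paternal allele is of breed-A origin and the maternal
    allele of breed-B origin, so each can carry its own effect.
    """
    row = []
    for p, m in zip(paternal, maternal):
        row.extend([p, m])
    return row


def predict(covariates, effects):
    """Genomic value as the sum of covariate times estimated effect."""
    return sum(x * b for x, b in zip(covariates, effects))


# Hypothetical phased alleles at three SNPs (1 = reference allele present).
paternal = [1, 0, 1]
maternal = [1, 1, 0]
print(asgm_covariates(paternal, maternal))  # [2, 1, 1]
print(bsam_covariates(paternal, maternal))  # [1, 1, 0, 1, 1, 0]
```

With this coding the breed-specific model estimates twice as many effects per SNP, which is one reason the size of the training population matters when comparing the two.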
In livestock, genomic selection (GS) has primarily been investigated by simulation of purebred populations. Traits of interest are, however, often measured in crossbred or mixed populations with uncertain breed composition. If such data are used as the training data for GS without accounting for breed composition, estimates of marker effects may be biased due to population stratification and admixture. To investigate this, a genome of 100 cM was simulated with varying marker densities (5 to 40 segregating markers per cM). After 1,000 generations of random mating in a population of effective size 500, 4 lines with effective size 100 were isolated and mated for another 50 generations to create 4 pure breeds. These breeds were used to generate combined, F(1), F(2), 3- and 4-way crosses, and admixed training data sets of 1,000 individuals with phenotypes for an additive trait controlled by 100 segregating QTL and heritability of 0.30. The validation data set was a sample of 1,000 genotyped individuals from one pure breed. Method Bayes-B was used to simultaneously estimate the effects of all markers for breeding value estimation. With 5 (40) markers per cM, the correlation of true with estimated breeding value of selection candidates (accuracy) was greatest, 0.79 (0.85), when data from the same pure breed were used for training. When the training data set consisted of crossbreds, the accuracy ranged from 0.66 (0.79) to 0.74 (0.83) for the 2 marker densities, respectively. The admixed training data set resulted in nearly the same accuracies as when training was in the breed to which selection candidates belonged. However, accuracy was greatly reduced when genes from the target pure breed were not included in the admixed or crossbred population. 
This implies that, with high-density markers, admixed and crossbred populations can be used to develop GS prediction equations for all pure breeds that contributed to the population, without a substantial loss of accuracy compared with training on purebred data, even if breed origin has not been explicitly taken into account. In addition, using GS based on high-density marker data, purebreds can be accurately selected for crossbred performance without the need for pedigree or breed information. Results also showed that haplotype segments with strong linkage disequilibrium are shorter in crossbred and admixed populations than in purebreds, providing opportunities for QTL fine mapping.
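The accuracy reported above is the correlation between true and estimated breeding values of the selection candidates. A minimal sketch of that calculation, assuming the two sets of values are available as plain Python lists (the function name and the example numbers are hypothetical):

```python
import math


def accuracy(true_bv, est_bv):
    """Pearson correlation between true and estimated breeding values."""
    n = len(true_bv)
    mean_t = sum(true_bv) / n
    mean_e = sum(est_bv) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_bv, est_bv))
    var_t = sum((t - mean_t) ** 2 for t in true_bv)
    var_e = sum((e - mean_e) ** 2 for e in est_bv)
    return cov / math.sqrt(var_t * var_e)


# Made-up values for four validation animals, for illustration only.
true_bv = [0.8, -0.3, 1.5, 0.1]
est_bv = [0.6, -0.1, 1.1, 0.4]
print(round(accuracy(true_bv, est_bv), 3))
```

In a simulation the true breeding values are known, so this correlation can be computed directly in the validation set.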
Data that are collected for whole-genome prediction can also be used for genome-wide association studies (GWAS). This paper discusses how Bayesian multiple-regression methods that are used for whole-genome prediction can be adapted for GWAS. It is argued here that controlling the posterior type I error rate (PER) is more suitable than controlling the genomewise error rate (GER) for controlling false positives in GWAS. It is shown here that under ideal conditions, i.e., when the model is correctly specified, PER can be controlled by using Bayesian posterior probabilities that are easy to obtain. Computer simulation was used to examine the properties of this Bayesian approach when the ideal conditions were not met. Results indicate that even then useful inferences can be made.
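One way to read the claim about posterior probabilities is sketched below. It assumes each tested genomic region has a posterior probability of association (called `ppa` here, a hypothetical name) taken from the MCMC output of a correctly specified model. Under that assumption, the posterior probability that a declared region is a false positive is one minus its value, so declaring only regions at or above a threshold t keeps the expected proportion of false positives among declarations at or below 1 - t.

```python
def declare_associations(ppa, threshold=0.95):
    """Indices of regions whose posterior probability of association
    is at least the threshold."""
    return [i for i, p in enumerate(ppa) if p >= threshold]


def expected_false_positive_proportion(ppa, selected):
    """Posterior expected proportion of false positives among the declared
    regions, assuming the model is correctly specified."""
    if not selected:
        return 0.0
    return sum(1.0 - ppa[i] for i in selected) / len(selected)


# Made-up posterior probabilities for four regions.
ppa = [0.99, 0.96, 0.50, 0.10]
selected = declare_associations(ppa, threshold=0.95)
print(selected)  # [0, 1]
print(expected_false_positive_proportion(ppa, selected) <= 1 - 0.95)  # True
```

This is an illustration of the thresholding idea only, not the paper's procedure for computing the posterior probabilities themselves.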
Background: Genomic selection is an appealing method to select purebreds for crossbred performance. In the case of crossbred records, single nucleotide polymorphism (SNP) effects can be estimated using an additive model or a breed-specific allele model. In most studies, additive gene action is assumed. However, dominance is the likely genetic basis of heterosis. Advantages of incorporating dominance in genomic selection were investigated in a two-way crossbreeding program for a trait with different magnitudes of dominance. Training was carried out only once in the simulation. Results: When the dominance variance and heterosis were large and overdominance was present, a dominance model including both additive and dominance SNP effects gave substantially greater cumulative response to selection than the additive model. Extra response was the result of an increase in heterosis but at a cost of reduced purebred performance. When the dominance variance and heterosis were realistic but with overdominance, the advantage of the dominance model decreased but was still significant. When overdominance was absent, the dominance model was slightly favored over the additive model, but the difference in response between the models increased as the number of quantitative trait loci increased. This reveals the importance of exploiting dominance even in the absence of overdominance. When there was no dominance, response to selection for the dominance model was as high as for the additive model, indicating robustness of the dominance model. The breed-specific allele model was inferior to the dominance model in all cases and to the additive model except when the dominance variance and heterosis were large and with overdominance. However, the advantage of the dominance model over the breed-specific allele model may decrease as differences in linkage disequilibrium between the breeds increase.
Retraining is expected to reduce the advantage of the dominance model over the alternatives, because in general, the advantage becomes important only after five or six generations post-training. Conclusion: Under dominance and without retraining, genomic selection based on the dominance model is superior to the additive model and the breed-specific allele model to maximize crossbred performance through purebred selection.
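To make the distinction between the additive and dominance models concrete, here is a minimal Python sketch of a genotypic value with additive and dominance SNP effects. The coding is an assumption chosen for illustration (additive covariate -1, 0, 1 and a heterozygote indicator), and the names `a`, `d`, and `genotypic_value` are hypothetical rather than the study's notation.

```python
def genotypic_value(allele_counts, a, d):
    """Sum over SNPs of additive and dominance contributions.

    allele_counts: 0, 1 or 2 copies of the reference allele per SNP.
    a: additive effects (half the difference between the homozygotes).
    d: dominance effects (heterozygote deviation from the homozygote mean).
    """
    total = 0.0
    for g, a_j, d_j in zip(allele_counts, a, d):
        additive_cov = g - 1                  # -1, 0, 1
        dominance_cov = 1 if g == 1 else 0    # 1 only for heterozygotes
        total += additive_cov * a_j + dominance_cov * d_j
    return total


def is_overdominant(a_j, d_j):
    """Under this coding, overdominance means |d| exceeds |a|."""
    return abs(d_j) > abs(a_j)


a = [1.0, 1.0, 1.0]
d = [0.5, 0.5, 0.5]
print(genotypic_value([0, 1, 2], a, d))  # 0.5
print(is_overdominant(1.0, 1.5))         # True
```

Setting every entry of `d` to zero recovers a purely additive model, which is consistent with the abstract's point that the dominance model loses little when no dominance is present.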
Background: Numerous studies have shown encouraging results of applying genomic selection (GS) in purebred populations [1-6]. However, except for dairy cattle, most animals used in livestock production systems are crossbreds, with advantages of heterosis and breed complementarity. For
Background: Population stratification and cryptic relationships have been the main sources of excessive false-positives and false-negatives in population-based association studies. Many methods have been developed to model these confounding factors and minimize their impact on the results of genome-wide association studies. In most of these methods, a two-stage approach is applied where: (1) methods are used to determine if there is a population structure in the sample dataset and (2) the effects of population structure are corrected either by modeling it or by running a separate analysis within each sub-population. The objective of this study was to evaluate the impact of population structure on the accuracy and power of genome-wide association studies using a Bayesian multiple regression method. Methods: We conducted a genome-wide association study in a stochastically simulated admixed population. The genome was composed of six chromosomes, each with 1000 markers. Fifteen segregating quantitative trait loci contributed to the genetic variation of a quantitative trait with heritability of 0.30. The impact of genetic relationships and breed composition (BC) on three analysis methods was evaluated: single marker simple regression (SMR), single marker mixed linear model (MLM) and Bayesian multiple-regression analysis (BMR). Each method was fitted with and without BC. Accuracy, power, false-positive rate and the positive predictive value of each method were calculated and used for comparison. Results: SMR and BMR, both without BC, were ranked as the worst and the best performing approaches, respectively.
Our results showed that, while explicit modeling of genetic relationships and BC is essential for models SMR and MLM, BMR can disregard them and yet result in a higher power without compromising its false-positive rate. Conclusions: This study showed that the Bayesian multiple-regression analysis is robust to population structure and to relationships among study subjects and performs better than a single marker mixed linear model approach.
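The comparison criteria named in this abstract (power, false-positive rate and positive predictive value) can be sketched as follows. The sketch assumes that each tested unit has a label, that the set of units truly harbouring a QTL is known from the simulation, and that a declaration counts as correct only when its label is in that set. The function name, the labels and the example counts are hypothetical, and the study's own rules for matching signals to QTL may differ.

```python
def gwas_metrics(declared, true_qtl, n_tests):
    """Power, false-positive rate and positive predictive value.

    declared: labels of units declared significant.
    true_qtl: labels of units that truly harbour a QTL.
    n_tests:  total number of units tested.
    """
    declared, true_qtl = set(declared), set(true_qtl)
    true_pos = len(declared & true_qtl)
    false_pos = len(declared - true_qtl)
    n_null = n_tests - len(true_qtl)
    return {
        "power": true_pos / len(true_qtl),
        "false_positive_rate": false_pos / n_null,
        "ppv": true_pos / len(declared) if declared else 0.0,
    }


# Illustrative numbers only: 6000 tested units, 15 true QTL,
# 12 declarations of which 10 are correct.
true_qtl = range(15)
declared = list(range(10)) + [100, 101]
print(gwas_metrics(declared, true_qtl, n_tests=6000))
```

Reporting the positive predictive value alongside power is useful because a method can have a low false-positive rate across thousands of null tests and still produce a list of declarations in which many entries are wrong.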