Abstract: We propose a novel computational method for genomic selection that combines identical-by-state (IBS)-based Haseman-Elston (HE) regression and best linear prediction (BLP), called HE-BLP. Genomic best linear unbiased prediction (GBLUP) has been widely used in whole-genome prediction for breeding programs. To determine the total genetic variance of a training population, a linear mixed model (LMM) should be solved via restricted maximum likelihood (REML), whose computational complexity is the cube of the sample …
“…Genomic selection (GS) has been widely used to estimate the breeding values in various fields, such as animal and plant breeding programs (Newell and Jannink, 2014; Liu and Chen, 2017; Weller et al, 2017). These breeding programs select their breeding animals or plants based on predicted genomic breeding values (GBVs).…”
Linkage disequilibrium (LD) is a useful parameter for guiding the accuracy and power of both genome-wide association studies (GWAS) and genomic selection (GS) among different livestock species. The present study evaluated the extent of LD, persistence of phase and effective population size (Ne) for the purebred (Mediterranean buffalo; n = 411) and crossbred [Mediterranean × Jianghan × Nili-Ravi buffalo, n = 9; Murrah × Nili-Ravi × local (Xilin or Fuzhong) buffalo, n = 36] buffalo populations using the 90K Buffalo SNP genotyping array. The results showed that the average square of correlation coefficient (r2) between adjacent SNP was 0.13 ± 0.19 across all autosomes for purebred and 0.09 ± 0.13 for crossbred, and the most rapid decline in LD was observed over the first 200 kb. Estimated r2 ≥ 0.2 extended up to ~50 kb in crossbred and 170 kb in purebred populations, while average r2 values ≥0.3 were respectively observed in the ~10 and 60 kb in the crossbred and purebred populations. The largest phase correlation (RP, C = 0.47) was observed at the distance of 100 kb, suggesting that this phase was not actively preserved between the two populations. Estimated Ne for the purebred and crossbred population at the current generation was 387 and 113 individuals, respectively. These findings may provide useful information to guide the GS and GWAS in buffaloes.
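The r2 statistic between adjacent SNPs reported above can be illustrated with a minimal sketch. It assumes unphased genotypes coded as 0/1/2 allele counts with no missing values and no monomorphic SNPs, and it approximates LD by the squared Pearson correlation of genotype dosages; the function name `adjacent_r2` is hypothetical, and the cited study may have used a different estimator or software.

```python
import numpy as np

def adjacent_r2(genotypes):
    """Squared genotype correlation between each SNP and the next one.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts,
    with columns ordered by map position (an assumption of this sketch).
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)          # center each SNP column
    left, right = centered[:, :-1], centered[:, 1:]
    cov = (left * right).sum(axis=0)       # cross-products of neighbours
    norm = np.sqrt((left ** 2).sum(axis=0) * (right ** 2).sum(axis=0))
    return (cov / norm) ** 2               # one r2 value per adjacent pair
```

Averaging the returned values over all autosomes would give a summary comparable in kind to the mean adjacent-SNP r2 quoted in the abstract.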
“…In contrast, REML is a model-based approach and the exact structure of the estimated variance, regardless of additive or dominance, remains elusive. Furthermore, as discussed in our previous study (Liu and Chen, 2017), the computational complex for HE is O(n²), proportional to the square of sample size, but for REML O(n³). The computational advantage of HE is important especially when the sample size is large.…”
Section: Statistical Models (mentioning)
confidence: 94%
“…In our previous study, we developed a fast genomic prediction approach (namely HEBLP, or HEBLP|A herein) combining identical-by-state (IBS)-based Haseman-Elston (HE) regression and best linear prediction (BLP). It can obtain the total additive genetic variance via a simple HE linear regression with reduced computation complexity, but only additive effects are included (Liu and Chen, 2017). The present study aims to develop the HEBLP with both the additive and dominance effects (HEBLP|AD) and to evaluate its predictive performance in the simulated and a real Arabidopsis thaliana F2 population.…”
In our previous work, we proposed a genomic prediction method combining identical-by-state-based Haseman-Elston regression and best linear prediction with additive variance component only (HEBLP|A herein), the most essential component of genetic variation. Since the dominance effects contribute significantly to heterosis, it is desirable to extend HEBLP with a dominance variance component, which is expected to enhance the predictive accuracy. This leads to the further development HEBLP|AD, a parallel implementation of genomic prediction compared with genomic best linear unbiased prediction (GBLUP). The simulation results indicated that when the dominance effects contributed a large proportion of genetic variation, HEBLP|AD and GBLUP|AD, having similar accuracy, both outperformed HEBLP|A; but when the dominance variation was absent or small, HEBLP|A, HEBLP|AD, and GBLUP|AD had similar predictability. The analysis of real data from an Arabidopsis thaliana F2 population also demonstrated the latter situation. In summary, HEBLP|AD performed stably whether a trait was controlled by dominance effects or not.
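The HE regression step mentioned in the excerpts above can be sketched as follows. This is a generic cross-product Haseman-Elston regression, not the authors' IBS-based implementation: it assumes a relationship matrix built from standardized 0/1/2 genotypes, phenotypes scaled to unit variance, and no monomorphic SNPs. The function name `he_regression` is hypothetical.

```python
import numpy as np

def he_regression(genotypes, phenotypes):
    """Estimate additive genetic variance (unit-variance phenotype scale).

    Regresses phenotype cross-products y_i * y_j (i < j) on the
    off-diagonal entries of a genomic relationship matrix; the slope
    is the variance estimate. Illustrative sketch only.
    """
    g = np.asarray(genotypes, dtype=float)
    z = (g - g.mean(axis=0)) / g.std(axis=0)   # standardize each SNP
    n, m = z.shape
    grm = z @ z.T / m                          # n x n relationship matrix

    y = np.asarray(phenotypes, dtype=float)
    y = (y - y.mean()) / y.std()               # standardize the phenotype

    upper = np.triu_indices(n, k=1)            # the n(n-1)/2 distinct pairs
    x = grm[upper]
    cross = np.outer(y, y)[upper]

    # Ordinary least-squares slope of cross-products on relatedness.
    xc = x - x.mean()
    return float(xc @ (cross - cross.mean()) / (xc @ xc))
```

The regression itself runs over n(n-1)/2 pairs of individuals, which is where the quadratic cost quoted in the excerpt comes from; no n × n matrix has to be factorized to obtain the variance estimate.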
“…Two distinct lineages may be easily defined (Fu and Williams 2008; Loskutov 2008) and extant diploids may be assigned unambiguously to either the A or the C branch. The A and C lineage divergence may have occurred from 4-20 myr ago (Liu and Chen 2017; Fu 2018). There is less certainty about the origin of sub-genomes within extant polyploids, although there is now a consensus that a variant lineage of the A genomes, designated as the D, is found with C genome lineages in most extant tetraploids, with one of these DC species subsequently giving rise to today's ADC hexaploids (Yan et al 2016b).…”
Oat (Avena sativa L.), ranking sixth in world cereal production, is primarily produced as a multipurpose crop for grain, pasture, and forage or as a rotation crop in many parts of the world. Recent research has elevated its potential dietary value for human nutrition and health care. Oats are well adapted to a wide range of soil types and can perform on acid soils. World oat production is concentrated between latitudes 35-65° N and 20-46° S. Avena genomes are large and complex, in the range of 4.12 Gb to 12.6 Gb. Oat productivity is affected by many diseases, although crown rust (Puccinia coronata f. sp. avenae) and stem rust (Puccinia graminis f. sp. avenae) are the key diseases worldwide. The focus of this chapter is to review the major developments and their impacts on oat breeding, especially on the challenges posed by climate or environmental changes (biotic and abiotic stresses mainly) for oat cultivation. Next generation breeding tools will help to develop approaches to genetically improve and manipulate oat, which would aid significantly in oat enhancement efforts. Although oat biotechnology has advanced at a similar pace as the rest of cereals, it still lags behind. More genomic tools, from genomic assisted breeding to genome editing tools, are needed to improve oats under climate change in the next few decades.