The first national single-step, full-information (phenotype, pedigree, and marker genotype) genetic evaluation was developed for final score of US Holsteins. Data included final scores recorded from 1955 to 2009 for 6,232,548 Holstein cows. BovineSNP50 (Illumina, San Diego, CA) genotypes from the Cooperative Dairy DNA Repository (Beltsville, MD) were available for 6,508 bulls. Three analyses used a repeatability animal model as currently used for the national US evaluation. The first 2 analyses used final scores recorded up to 2004. The first analysis used only a pedigree-based relationship matrix. The second analysis used a relationship matrix based on both pedigree and genomic information (single-step approach). The third analysis used the complete data set and only the pedigree-based relationship matrix. The fourth analysis used predictions from the first analysis (final scores up to 2004 and only a pedigree-based relationship matrix) and prediction using a genomic-based matrix to obtain genetic evaluation (multiple-step approach). Different allele frequencies were tested in construction of the genomic relationship matrix. Coefficients of determination between predictions of young bulls from parent average, single-step, and multiple-step approaches and their 2009 daughter deviations were 0.24, 0.37 to 0.41, and 0.40, respectively. The highest coefficient of determination for a single-step approach was observed when using a genomic relationship matrix with assumed allele frequencies of 0.5. Coefficients for regression of 2009 daughter deviations on parent-average, single-step, and multiple-step predictions were 0.76, 0.68 to 0.79, and 0.86, respectively, which indicated some inflation of predictions. The single-step regression coefficient could be increased up to 0.92 by scaling differences between the genomic and pedigree-based relationship matrices with little loss in accuracy of prediction.
One complete evaluation took about 2 h of computing time and 2.7 GB of memory. Computing times for single-step analyses were slightly longer (2%) than for pedigree-based analysis. A national single-step genetic evaluation with the pedigree relationship matrix augmented with genomic information provided genomic predictions with accuracy and bias comparable to multiple-step procedures and could account for any population or data structure. Advantages of single-step evaluations should increase in the future when animals are pre-selected on genotypes.
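The abstract above notes that different allele frequencies were tested when constructing the genomic relationship matrix, with the best fit at assumed frequencies of 0.5. A minimal sketch of one common construction follows. It assumes a VanRaden-style formula, G = ZZ' / (2 Σ p(1 − p)), with genotypes coded 0/1/2; the abstract does not state the exact formula used, and the function name `genomic_relationship` and the toy genotypes are hypothetical.

```python
import numpy as np

def genomic_relationship(genotypes, freqs=None):
    """Build a genomic relationship matrix from 0/1/2 genotype codes.

    genotypes: animals x markers array of allele counts (assumed coding).
    freqs: allele frequency per marker; None means 0.5 for every marker.
    """
    M = np.asarray(genotypes, dtype=float)
    if freqs is None:
        p = np.full(M.shape[1], 0.5)      # "equal" allele frequencies
    else:
        p = np.asarray(freqs, dtype=float)
    Z = M - 2.0 * p                       # center each marker by 2p
    scale = 2.0 * np.sum(p * (1.0 - p))   # puts G on a scale similar to A
    return Z @ Z.T / scale

# Toy data: 3 animals, 4 markers (made up for illustration only).
toy = np.array([[0, 1, 2, 1],
                [2, 1, 0, 1],
                [1, 1, 1, 2]])
G_equal = genomic_relationship(toy)                               # p = 0.5
G_current = genomic_relationship(toy, freqs=toy.mean(axis=0) / 2)  # observed p
```

Passing observed frequencies instead of 0.5 changes both the centering and the scaling, which is one way the choice of allele frequencies can shift the level of G relative to the pedigree-based matrix.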
Genetic parameters were estimated simultaneously for 5 herd-life traits, 15 conformation (type) traits, and milk yield measured in first lactation for 128,601 Holstein cows. Heritabilities of all traits were higher in registered than in grade cows. Genetic correlations of linear type traits with first lactation yield ranged from -.48 for udder depth to .54 for dairy form. Genetic correlations among milk yield and herd-life traits were all positive except for milk-corrected herd life in grade cows. Udder traits had largest absolute genetic correlations with herd-life traits, followed by body traits and feet and leg traits. Some traits associated with body size and foot angle differed between registered and grade cows. Estimates of genetic trends from obtained parameters revealed greatest progress for milk yield from single-trait selection but also the largest changes for some type traits and milk-corrected herd life in an undesirable direction. Relative milk to type ratios of between 2:1 and 3:1 yielded 90% of the gain in milk yield with no change or slight improvement in type traits and functional herd life. Selection for type traits associated with herd life appears to be warranted to improve days of functional herd life or to decrease involuntary culling of dairy cows.
Currently, the USDA uses a single-trait (ST) model with several intermediate steps to obtain genomic evaluations for US Holsteins. In this study, genomic evaluations for 18 linear type traits were obtained with a multiple-trait (MT) model using a unified single-step procedure. The phenotypic type data on up to 18 traits were available for 4,813,726 Holsteins, and single nucleotide polymorphism markers from the Illumina BovineSNP50 genotyping Beadchip (Illumina Inc., San Diego, CA) were available on 17,293 bulls. Genomic predictions were computed with several genomic relationship matrices (G) that assumed different allele frequencies: equal, base, current, and current scaled. Computations were carried out with ST and MT models. Procedures were compared by coefficients of determination (R²) and regression of 2004 prediction of bulls with no daughters in 2004 on daughter deviations of those bulls in 2009. Predictions for 2004 also included parent averages without the use of genomic information. The R² for parent averages ranged from 10 to 34% for ST models and from 12 to 35% for MT models. The average R² for all G were 34 and 37% for ST and MT models, respectively. All of the regression coefficients were <1.0, indicating that estimated breeding values in 2009 of 1,307 genotyped young bulls' parents tended to be biased. The average regression coefficients ranged from 0.74 to 0.79 and from 0.75 to 0.80 for ST and MT models, respectively. When the weight for the inverse of the numerator relationship matrix (A⁻¹) for genotyped animals was reduced from 1 to 0.7, R² remained almost identical while the regression coefficients increased by 0.11 to 0.26 and 0.12 to 0.23 for ST and MT models, respectively. The ST models required about 5 s per iteration, whereas MT models required 3 (6) min per iteration for the regular (genomic) model. The MT single-step approach is feasible for 18 linear type traits in US Holstein cattle.
Accuracy of genomic evaluation increases when switching from ST to MT models. Inflation of genomic evaluations for young bulls could be reduced by choosing a small weight for A⁻¹ for genotyped bulls.
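The weight on the inverse numerator relationship matrix for genotyped animals can be illustrated with the usual single-step form of the combined inverse, H⁻¹ = A⁻¹ + [0, 0; 0, G⁻¹ − ω A₂₂⁻¹], where A₂₂ is the pedigree relationship block for genotyped animals. The sketch below assumes that form; the abstract gives only the weights (1 and 0.7), so the function name `h_inverse`, the symbol ω (`omega`), and the toy pedigree are illustrative assumptions.

```python
import numpy as np

def h_inverse(A_inv, A22, G, genotyped, omega=1.0):
    """Combine pedigree and genomic information into H-inverse.

    A_inv: inverse of the full pedigree relationship matrix.
    A22: pedigree relationships among genotyped animals only.
    G: genomic relationships among the same genotyped animals.
    genotyped: row/column indices of genotyped animals in A_inv.
    omega: weight on the inverse of A22 (assumed name; 1 = no down-weighting).
    """
    H_inv = np.array(A_inv, dtype=float, copy=True)
    block = np.ix_(genotyped, genotyped)
    # Only the genotyped block changes; all other elements stay at A-inverse.
    H_inv[block] += np.linalg.inv(G) - omega * np.linalg.inv(A22)
    return H_inv

# Toy pedigree: unrelated sire (0) and dam (1) with one offspring (2).
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
genotyped = [0, 2]                      # sire and offspring are genotyped
A22 = A[np.ix_(genotyped, genotyped)]
G = A22 + 0.05 * np.eye(2)              # made-up genomic matrix near A22
H_inv_full = h_inverse(np.linalg.inv(A), A22, G, genotyped, omega=1.0)
H_inv_reduced = h_inverse(np.linalg.inv(A), A22, G, genotyped, omega=0.7)
```

With `omega` below 1, less of the pedigree information for genotyped animals is subtracted, which is the adjustment the abstract associates with regression coefficients closer to 1.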
Data included 392,800 records for cows born between 1995 and 1997. Traits analyzed were milk, fat, and protein yields, somatic cell score, days open (DO), 18 linear type traits, final score, and several measures of longevity. Productive life (PL) was defined as the total number of days in milk up to 84 mo of age with a restriction of 305, 500, or 999 d per lactation (PL(305), PL(500), or PL(999), respectively). Herd life was defined as the total number of days from the first calving date to the last (culling) date. A multiple-trait sire model including the effects of registration status, herd-year, age group, month of calving and stage of lactation, sire, and residual was used for parameter estimation. The average duration of the first lactation was 365 d for survivors and 386 d for culled cows. Lactation lengths for the survivors in the next 3 parities all exceeded 330 d. Heritability estimates of between 0.08 and 0.10 were obtained for all definitions of longevity. As maximum recordable PL was increased from 305 to 999 d per lactation, the genetic correlations with milk production increased (from -0.11 to +0.14) and with DO decreased (-0.62 to -0.27). Formulas for an indirect prediction of PL from correlated traits were developed. As maximum PL per lactation was increased, little change was found in the weights used to predict the various measures of PL, with the exception of the weight for DO. Because the currently used PL(305) does not properly account for the longer lactations that now occur routinely, PL with longer lactations may be preferable in routine evaluation.
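The three PL definitions differ only in the cap on days credited per lactation. A minimal sketch of that calculation follows, assuming each lactation is described by the cow's age at calving (in days) and its days in milk; the data layout, the function name `productive_life`, the average month length, and the example cow are all illustrative assumptions rather than details from the study.

```python
DAYS_PER_MONTH = 30.4375  # assumption: average month length in days

def productive_life(lactations, cap_per_lactation, max_age_months=84):
    """Sum days in milk up to an age limit, capped per lactation.

    lactations: iterable of (age_at_calving_days, days_in_milk) pairs.
    cap_per_lactation: maximum days credited per lactation (305, 500, or 999).
    """
    age_limit = max_age_months * DAYS_PER_MONTH
    total = 0.0
    for age_at_calving, days_in_milk in lactations:
        # Ignore any part of the lactation that falls after the age limit.
        days_before_limit = max(0.0, min(days_in_milk, age_limit - age_at_calving))
        total += min(days_before_limit, cap_per_lactation)
    return total

# Hypothetical cow with three lactations of 365, 400, and 290 days in milk.
cow = [(730, 365), (1150, 400), (1600, 290)]
pl_by_cap = {cap: productive_life(cow, cap) for cap in (305, 500, 999)}
```

For this cow the 305-d cap truncates the first two lactations, whereas the 500-d and 999-d caps credit them in full, which shows how the cap can matter when lactations routinely exceed 305 d.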
The purpose of this study was to evaluate the accuracy of genomic selection in single-step genomic BLUP (ssGBLUP) when the inverse of the genomic relationship matrix (G) is derived by the "algorithm for proven and young animals" (APY). This algorithm implements genomic recursions on a subset of "proven" animals. Only a relationship matrix for animals treated as "proven" needs to be inverted, and the extra costs of adding animals treated as "young" are linear. Analyses involved 10,102,702 final scores on 6,930,618 Holstein cows. Final score, which is a composite of type traits, is a popular trait in the United States and was easily available for this study. A total of 100,000 animals with genotypes were used in the analyses and included 23,000 sires (16,000 with >5 progeny), 27,000 cows, and 50,000 young animals. Genomic EBV (GEBV) were calculated with a regular inverse of G, and with the G inverse approximated by APY. Animals in the proven subset included only sires (23,000), sires+cows (50,000), only cows (27,000), or sires with >5 progeny (16,000). The correlations of GEBV with APY and regular GEBV for young genotyped animals were 0.994, 0.995, 0.992, and 0.992, respectively. Later, animals in the proven subset were randomly sampled from all genotyped animals in sets of 2,000, 5,000, 10,000, 15,000, and 20,000; each sample was replicated 4 times. Respective correlations were 0.97 (5,000 sample), 0.98 (10,000 sample), and 0.99 (20,000 sample), with minimal difference between samples of the same size. Genomic EBV with APY were accurate when the number of animals used in the subset was between 10,000 and 20,000, with little difference between the ways of creating the subset. Due to the approximately linear cost of APY, ssGBLUP with APY could support any number of genotyped animals without affecting accuracy.
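The structure described above, with one dense inverse for the "proven" subset and a diagonal term per "young" animal, can be sketched as follows. This follows the APY block formula as commonly published, which is an assumption here because the abstract does not spell it out: with P = G_pp⁻¹ G_py and diagonal M holding m_i = g_ii − g_ip G_pp⁻¹ g_pi, the inverse has blocks G_pp⁻¹ + P M⁻¹ P', −P M⁻¹, and M⁻¹. The function name `apy_inverse` and the simulated matrix are hypothetical.

```python
import numpy as np

def apy_inverse(G, proven):
    """Approximate the inverse of G by inverting only the proven block."""
    n = G.shape[0]
    proven = np.asarray(proven)
    young = np.setdiff1d(np.arange(n), proven)
    Gpp_inv = np.linalg.inv(G[np.ix_(proven, proven)])  # the only dense inverse
    Gpy = G[np.ix_(proven, young)]
    P = Gpp_inv @ Gpy                 # recursion coefficients, young on proven
    # One scalar per young animal: g_ii minus the part explained by proven animals.
    m = G[young, young] - np.einsum("ij,ij->j", Gpy, P)
    G_inv = np.zeros((n, n))
    G_inv[np.ix_(proven, proven)] = Gpp_inv + (P / m) @ P.T
    G_inv[np.ix_(proven, young)] = -P / m
    G_inv[np.ix_(young, proven)] = (-P / m).T
    G_inv[young, young] = 1.0 / m     # young-by-young block is diagonal
    return G_inv

# Simulated positive definite matrix standing in for G (illustration only).
rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 30))
G_demo = Z @ Z.T / 30 + 0.05 * np.eye(8)
G_inv_apy = apy_inverse(G_demo, proven=[0, 1, 2, 3])
```

Each additional young animal adds one column to `P` and one element to `m`, so the work beyond the proven-block inverse grows linearly with the number of young animals, consistent with the cost argument in the abstract.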
The objectives of this study were to develop and evaluate an efficient implementation of the computation of the inverse of the genomic relationship matrix with the recursion algorithm, called the algorithm for proven and young (APY), in single-step genomic BLUP. We validated genomic predictions for young bulls with more than 500,000 genotyped animals in final score for US Holsteins. Phenotypic data included 11,626,576 final scores on 7,093,380 US Holstein cows, and genotypes were available for 569,404 animals. Daughter deviations for young bulls with no classified daughters in 2009, but at least 30 classified daughters in 2014 were computed using all the phenotypic data. Genomic predictions for the same bulls were calculated with single-step genomic BLUP using phenotypes up to 2009. We calculated the inverse of the genomic relationship matrix (G_APY⁻¹) based on a direct inversion of the genomic relationship matrix on a small subset of genotyped animals (core animals) and extended that information to noncore animals by recursion. We tested several sets of core animals including 9,406 bulls with at least 1 classified daughter, 9,406 bulls and 1,052 classified dams of bulls, 9,406 bulls and 7,422 classified cows, and random samples of 5,000 to 30,000 animals. Validation reliability was assessed by the coefficient of determination from regression of daughter deviation on genomic predictions for the predicted young bulls. The reliabilities were 0.39 with 5,000 randomly chosen core animals, 0.45 with the 9,406 bulls and 7,422 cows as core animals, and 0.44 with the remaining sets. With phenotypes truncated in 2009 and the preconditioned conjugate gradient to solve mixed model equations, the number of rounds to convergence for core animals defined by bulls was 1,343; defined by bulls and cows, 2,066; and defined by 10,000 random animals, at most 1,629. With complete phenotype data, the number of rounds decreased to 858, 1,299, and at most 1,092, respectively.
Setting up G_APY⁻¹ for 569,404 genotyped animals with 10,000 core animals took 1.3 h and 57 GB of memory. The validation reliability with APY reaches a plateau when the number of core animals is at least 10,000. Predictions with APY differ little in reliability among definitions of core animals. Single-step genomic BLUP with APY is applicable to millions of genotyped animals.
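Validation reliability as used above is the coefficient of determination from regressing daughter deviations on genomic predictions; the same regression yields the slope that the earlier abstracts use to judge inflation. A minimal sketch in plain Python follows. The function name `validate_predictions` and the numbers are made up for illustration and are not results from any of the studies.

```python
def validate_predictions(predictions, daughter_deviations):
    """Regress daughter deviations on predictions; return (slope, r_squared)."""
    n = len(predictions)
    mean_x = sum(predictions) / n
    mean_y = sum(daughter_deviations) / n
    sxx = sum((x - mean_x) ** 2 for x in predictions)
    syy = sum((y - mean_y) ** 2 for y in daughter_deviations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predictions, daughter_deviations))
    slope = sxy / sxx                  # below 1 suggests inflated predictions
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation
    return slope, r_squared

# Hypothetical early predictions and later daughter deviations for 5 bulls.
early_predictions = [1.2, 0.4, -0.3, 2.1, 0.9]
later_deviations = [0.8, 0.5, -0.4, 1.5, 0.3]
slope, r_squared = validate_predictions(early_predictions, later_deviations)
```

A slope near 1 with a high coefficient of determination would indicate predictions that are both well scaled and informative about later daughter performance.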
Reliability of predictions from single-step genomic BLUP (ssGBLUP) can be calculated by matrix inversion, but that is not feasible for large data sets. Two methods of approximating reliability were developed based on the decomposition of a function of reliability into contributions from records, pedigrees, and genotypes. Those contributions can be expressed in record or daughter equivalents. The first approximation method involved inversion of a matrix that contains inverses of the genomic relationship matrix and the pedigree relationship matrix for genotyped animals. The second approximation method involved only the diagonal elements of those inverses. The 2 approximation methods were tested with a simulated data set. The correlations between ssGBLUP and approximated contributions from genomic information were 0.92 for the first approximation method and 0.56 for the second approximation method; contributions were inflated by 62 and 258%, respectively. The respective correlations for reliabilities were 0.98 and 0.72. After empirical correction for inflation, those correlations increased to 0.99 and 0.89. Approximations of reliabilities of predictions by ssGBLUP are accurate and computationally feasible for populations with up to 100,000 genotyped animals. A critical part of the approximations is quality control of information from single nucleotide polymorphisms and proper scaling of the genomic relationship matrix.