The first national single-step, full-information (phenotype, pedigree, and marker genotype) genetic evaluation was developed for final score of US Holsteins. Data included final scores recorded from 1955 to 2009 for 6,232,548 Holstein cows. BovineSNP50 (Illumina, San Diego, CA) genotypes from the Cooperative Dairy DNA Repository (Beltsville, MD) were available for 6,508 bulls. Four analyses used a repeatability animal model as currently used for the national US evaluation. The first 2 analyses used final scores recorded up to 2004. The first analysis used only a pedigree-based relationship matrix. The second analysis used a relationship matrix based on both pedigree and genomic information (single-step approach). The third analysis used the complete data set and only the pedigree-based relationship matrix. The fourth analysis used predictions from the first analysis (final scores up to 2004 and only a pedigree-based relationship matrix) and prediction using a genomic-based matrix to obtain genetic evaluations (multiple-step approach). Different allele frequencies were tested in construction of the genomic relationship matrix. Coefficients of determination between predictions of young bulls from parent average, single-step, and multiple-step approaches and their 2009 daughter deviations were 0.24, 0.37 to 0.41, and 0.40, respectively. The highest coefficient of determination for a single-step approach was observed when using a genomic relationship matrix with assumed allele frequencies of 0.5. Coefficients for regression of 2009 daughter deviations on parent-average, single-step, and multiple-step predictions were 0.76, 0.68 to 0.79, and 0.86, respectively, which indicated some inflation of predictions. The single-step regression coefficient could be increased up to 0.92 by scaling differences between the genomic and pedigree-based relationship matrices with little loss in accuracy of prediction.
One complete evaluation took about 2 h of computing time and 2.7 gigabytes of memory. Computing times for single-step analyses were slightly longer (2%) than for pedigree-based analysis. A national single-step genetic evaluation with the pedigree relationship matrix augmented with genomic information provided genomic predictions with accuracy and bias comparable to multiple-step procedures and could account for any population or data structure. Advantages of single-step evaluations should increase in the future when animals are pre-selected on genotypes.
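The effect of the assumed allele frequencies on the genomic relationship matrix can be illustrated with a small sketch. The abstract does not state the formula, so the code assumes the widely used form G = ZZ'/(2Σp(1−p)), with genotypes coded 0/1/2 and centered by 2p. The function name and the toy genotypes are hypothetical.

```python
import numpy as np

def genomic_relationship(genotypes, allele_freq=None):
    """G = ZZ' / (2 * sum(p * (1 - p))) with Z = M - 2p (assumed form)."""
    M = np.asarray(genotypes, dtype=float)       # animals x SNPs, coded 0/1/2
    if allele_freq is None:
        p = M.mean(axis=0) / 2.0                 # frequencies observed in the data
    else:
        p = np.full(M.shape[1], float(allele_freq))  # same value for every SNP
    Z = M - 2.0 * p                              # center each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Toy data: 4 animals, 5 SNPs (hypothetical)
M = np.array([[0, 1, 2, 1, 0],
              [1, 1, 2, 0, 0],
              [2, 0, 1, 1, 1],
              [0, 2, 2, 1, 0]])

G_equal = genomic_relationship(M, allele_freq=0.5)   # all frequencies set to 0.5
G_current = genomic_relationship(M)                  # observed frequencies
print(np.round(G_equal, 3))
print(np.round(G_current, 3))
```

With observed frequencies every column of Z sums to zero, so the rows of G also sum to zero and G is singular. This is one reason why the choice of frequencies and any scaling relative to the pedigree matrix matter in practice.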
Predictive ability of genomic EBV when using single-step genomic BLUP (ssGBLUP) in Angus cattle was investigated. Over 6 million records were available on birth weight (BiW) and weaning weight (WW), almost 3.4 million on postweaning gain (PWG), and over 1.3 million on calving ease (CE). Genomic information was available on, at most, 51,883 animals, which included high and low EBV accuracy animals. Traditional EBV were computed by BLUP, and genomic EBV were computed by ssGBLUP and by indirect prediction based on SNP effects derived from ssGBLUP. SNP effects were calculated based on the following reference populations: ref_2k (contains top bulls and top cows that had an EBV accuracy for BiW ≥0.85), ref_8k (contains all parents that were genotyped), and ref_33k (contains all genotyped animals born up to 2012). Indirect prediction was obtained as direct genomic value (DGV) or as an index of DGV and parent average (PA). Additionally, runs with ssGBLUP used the inverse of the genomic relationship matrix calculated by an algorithm for proven and young animals (APY) that uses recursions on a small subset of reference animals. An extra reference subset included 3,872 genotyped parents of genotyped animals (ref_4k). Cross-validation was used to assess predictive ability on a validation population of 18,721 animals born in 2013. Computations for growth traits used a multiple-trait linear model and, for CE, a bivariate CE-BiW threshold-linear model. With BLUP, predictivities were 0.29, 0.34, 0.23, and 0.12 for BiW, WW, PWG, and CE, respectively. With ssGBLUP and ref_2k, predictivities were 0.34, 0.35, 0.27, and 0.13 for BiW, WW, PWG, and CE, respectively, and with ssGBLUP and ref_33k, predictivities were 0.39, 0.38, 0.29, and 0.13 for BiW, WW, PWG, and CE, respectively. Low predictivity for CE was due to the low incidence rate of difficult calving. Indirect predictions with ref_33k were as accurate as with full ssGBLUP.
Using the APY with recursions on ref_4k achieved 88% of the gains of full ssGBLUP, and using the APY with recursions on ref_8k achieved 97% of those gains. Genomic evaluation in beef cattle with ssGBLUP is feasible while keeping the models (maternal, multiple trait, and threshold) already used in regular BLUP. Gains in predictivity are dependent on the composition of the reference population. Indirect predictions via SNP effects derived from ssGBLUP allow for accurate genomic predictions on young animals, with no advantage of including PA in the index if the reference population is large. With the APY conditioning on about 10,000 reference animals, ssGBLUP is potentially applicable to a large number of genotyped animals without compromising predictive ability.
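Indirect prediction via SNP effects can be sketched as follows. The abstract does not give the backsolving formula; the code assumes an unblended G = ZZ'/k with k = 2Σp(1−p), for which SNP effects follow as û = Z'G⁻¹â/k, where â holds the GEBV of the reference animals. All sizes, genotypes, and GEBV values are simulated stand-ins, and a production evaluation that blends G with pedigree relationships would need a modified formula.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ref, n_young, n_snp = 6, 3, 50                 # toy sizes (hypothetical)
M_ref = rng.integers(0, 3, size=(n_ref, n_snp)).astype(float)
M_young = rng.integers(0, 3, size=(n_young, n_snp)).astype(float)

p = np.full(n_snp, 0.5)                          # assumed allele frequencies
k = 2.0 * np.sum(p * (1.0 - p))                  # scaling constant of G
Z_ref = M_ref - 2.0 * p                          # centered reference genotypes
Z_young = M_young - 2.0 * p                      # centered young-animal genotypes
G = Z_ref @ Z_ref.T / k                          # unblended G for the reference set

gebv_ref = rng.normal(size=n_ref)                # stand-in for GEBV from ssGBLUP

# Backsolve SNP effects from GEBV: u_hat = Z' G^{-1} a_hat / k
snp_effects = Z_ref.T @ np.linalg.solve(G, gebv_ref) / k

# Indirect prediction (DGV) for young animals uses only genotypes and SNP effects
dgv_young = Z_young @ snp_effects
print(np.round(dgv_young, 3))
```

Under these assumptions, applying the SNP effects back to the reference genotypes reproduces the reference GEBV exactly, which is a convenient consistency check.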
Currently, the USDA uses a single-trait (ST) model with several intermediate steps to obtain genomic evaluations for US Holsteins. In this study, genomic evaluations for 18 linear type traits were obtained with a multiple-trait (MT) model using a unified single-step procedure. The phenotypic type data on up to 18 traits were available for 4,813,726 Holsteins, and single nucleotide polymorphism markers from the Illumina BovineSNP50 genotyping BeadChip (Illumina Inc., San Diego, CA) were available on 17,293 bulls. Genomic predictions were computed with several genomic relationship matrices (G) that assumed different allele frequencies: equal, base, current, and current scaled. Computations were carried out with ST and MT models. Procedures were compared by coefficients of determination (R²) and regression of 2004 prediction of bulls with no daughters in 2004 on daughter deviations of those bulls in 2009. Predictions for 2004 also included parent averages without the use of genomic information. The R² for parent averages ranged from 10 to 34% for ST models and from 12 to 35% for MT models. The average R² for all G were 34 and 37% for ST and MT models, respectively. All of the regression coefficients were <1.0, indicating that estimated breeding values in 2009 of 1,307 genotyped young bulls' parents tended to be biased. The average regression coefficients ranged from 0.74 to 0.79 and from 0.75 to 0.80 for ST and MT models, respectively. When the weight for the inverse of the numerator relationship matrix (A⁻¹) for genotyped animals was reduced from 1 to 0.7, R² remained almost identical while the regression coefficients increased by 0.11 to 0.26 and 0.12 to 0.23 for ST and MT models, respectively. The ST models required about 5 s per iteration, whereas MT models required 3 (6) min per iteration for the regular (genomic) model. The MT single-step approach is feasible for 18 linear type traits in US Holstein cattle.
Accuracy of genomic evaluation increases when switching from ST to MT models. Inflation of genomic evaluations for young bulls could be reduced by choosing a small weight for the A⁻¹ for genotyped bulls.
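The weight on A⁻¹ for genotyped animals can be made concrete with a short sketch. It assumes the commonly cited single-step form H⁻¹ = A⁻¹ + [0 0; 0 G⁻¹ − ωA₂₂⁻¹], where A₂₂ is the pedigree relationship block of the genotyped animals and ω is the weight (1 or 0.7 in the abstract). The function name, the toy pedigree matrix, and the values in G are hypothetical.

```python
import numpy as np

def h_inverse(A, G, genotyped, omega=1.0):
    """H^-1 = A^-1 + [0 0; 0 G^-1 - omega * A22^-1] (assumed single-step form)."""
    idx = np.asarray(genotyped)
    A22 = A[np.ix_(idx, idx)]                    # pedigree block of genotyped animals
    H_inv = np.linalg.inv(A)
    H_inv[np.ix_(idx, idx)] += np.linalg.inv(G) - omega * np.linalg.inv(A22)
    return H_inv

# Toy pedigree: animals 0 and 1 are unrelated founders, 2 and 3 are their full sibs
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])
G = np.array([[1.05, 0.55],
              [0.55, 0.98]])                     # hypothetical genomic relationships

H_inv_full = h_inverse(A, G, genotyped=[2, 3], omega=1.0)
H_inv_reduced = h_inverse(A, G, genotyped=[2, 3], omega=0.7)
print(np.round(H_inv_full, 3))
print(np.round(H_inv_reduced, 3))
```

With ω = 1, inverting H⁻¹ returns G in the block for genotyped animals. Reducing ω leaves more of A₂₂⁻¹ in that block, which in this sketch is the only change between the two matrices.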
The purpose of this study was to compare estimates of genetic parameters for sequential growth of beef cattle using two models and two data sets. Growth curves of Nellore cattle were analyzed using body weights measured at ages 1 (birth weight) to 733 d. Two data samples were created, one with 71,867 records sampled from all herds (MISS), and the other with 74,601 records sampled from herds with no missing traits (NMISS). Records preadjusted to a fixed age were analyzed by a multiple-trait model (MTM), which included the effects of contemporary group, age of dam class, additive direct, additive maternal, and maternal permanent environment. Analyses were by REML, with five traits at a time. The random regression model (RRM) included the effects of age of animal, contemporary group, age of dam class, additive direct, additive maternal, permanent environment, and maternal permanent environment. All effects were modeled as cubic Legendre polynomials. These analyses were also by REML. Shapes of estimates of variances by MTM were mostly similar for both data sets for all except late ages, where estimates for MISS were less regular, and for birth weight with MISS. Genetic correlations among ages for the direct and maternal effects were less smooth with MISS. Genetic correlations between direct and maternal effects were more negative for NMISS, where few sires were maternal grandsires. Parameter estimates with RRM were similar to those with MTM except that estimates of variances showed more artifacts for MISS; the estimates of additive direct-maternal correlations were more negative with both data sets and approached -1.0 for some ages with NMISS. When parameters of a growth model obtained by RRM are used for genetic evaluation, these parameters should be examined for consistency with parameters from MTM and prior information, and adjustments may be required to eliminate artifacts.
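Examining RRM parameters for artifacts usually means converting the (co)variances of the regression coefficients into variances at specific ages. The sketch below assumes normalized Legendre polynomials on ages standardized to [-1, 1] over 1 to 733 d, a common convention that the abstract does not confirm. The coefficient matrix K is hypothetical and is not an estimate from the study.

```python
import math

def legendre_covariates(age, age_min=1.0, age_max=733.0):
    """Normalized Legendre polynomials of order 0..3 at a standardized age."""
    x = -1.0 + 2.0 * (age - age_min) / (age_max - age_min)   # map age to [-1, 1]
    P = [1.0, x, (3.0 * x**2 - 1.0) / 2.0, (5.0 * x**3 - 3.0 * x) / 2.0]
    return [math.sqrt((2 * k + 1) / 2.0) * P[k] for k in range(4)]

def variance_at_age(K, age):
    """Variance implied by the coefficient covariance matrix K: phi' K phi."""
    phi = legendre_covariates(age)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(4) for j in range(4))

# Hypothetical (co)variance matrix of the four random regression coefficients
K = [[400.0, 60.0, -10.0, 0.0],
     [60.0, 50.0, 5.0, 0.0],
     [-10.0, 5.0, 8.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

for age in (1, 205, 365, 550, 733):
    print(age, round(variance_at_age(K, age), 1))
```

Plotting such implied variances against age, next to MTM estimates at the same ages, is one simple way to spot irregular behavior at the extremes of the age range.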
Genomic evaluations can be calculated using a unified procedure that combines phenotypic, pedigree and genomic information. Implementation of such a procedure requires the inverse of the relationship matrix based on pedigree and genomic relationships. The objective of this study was to investigate efficient computing options to create relationship matrices based on genomic markers and pedigree information as well as their inverses. SNP marker information was simulated for a panel of 40 K SNPs, with the number of genotyped animals up to 30 000. Matrix multiplication in the computation of the genomic relationship was by a simple 'do' loop, by two optimized versions of the loop, and by a specific matrix multiplication subroutine. Inversion was by a generalized inverse algorithm and by a LAPACK subroutine. With the most efficient choices and parallel processing, creation of matrices for 30 000 animals would take a few hours. Matrices required to implement a unified approach can be computed efficiently. Optimizations can be either by modifications of existing code or by the use of efficient automatic optimizations provided by open source or third-party libraries.
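The contrast between a hand-written loop and a library matrix product can be reproduced on a small scale. The sketch below is a Python analogue, not the code used in the study: it computes ZZ' with an explicit triple loop and with NumPy's matrix product, which delegates to an optimized BLAS routine. The matrix sizes are deliberately tiny and the genotypes are simulated.

```python
import time
import numpy as np

def zzt_loop(Z):
    """Z Z' by an explicit triple loop (analogue of a simple 'do' loop)."""
    n, m = Z.shape
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(m):
                s += Z[i, k] * Z[j, k]
            out[i, j] = s
    return out

rng = np.random.default_rng(0)
# 40 animals x 200 SNPs, genotypes coded -1/0/1 (toy size)
Z = rng.integers(0, 3, size=(40, 200)).astype(float) - 1.0

t0 = time.perf_counter()
slow = zzt_loop(Z)
t1 = time.perf_counter()
fast = Z @ Z.T                                   # library matrix multiplication
t2 = time.perf_counter()

print(f"loop: {t1 - t0:.4f} s, library matmul: {t2 - t1:.6f} s")
```

Both versions give the same matrix; only the time differs, and the gap widens quickly as the number of animals and SNPs grows.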
Data included 392,800 records for cows born between 1995 and 1997. Traits analyzed were milk, fat, and protein yields, somatic cell score, days open (DO), 18 linear type traits, final score, and several measures of longevity. Productive life (PL) was defined as the total number of days in milk up to 84 mo of age with a restriction of 305, 500, or 999 d per lactation (PL(305), PL(500), or PL(999), respectively). Herd life was defined as the total number of days from the first calving date to the last (culling) date. A multiple-trait sire model including the effects of registration status, herd-year, age group, month of calving and stage of lactation, sire, and residual was used for parameter estimation. The average duration of the first lactation was 365 d for survivors and 386 d for culled cows. Lactation lengths for the survivors in the next 3 parities all exceeded 330 d. Heritability estimates between 0.08 and 0.10 were obtained for all definitions of longevity. As maximum recordable PL was increased from 305 to 999 d per lactation, the genetic correlations with milk production increased (from -0.11 to +0.14) and those with DO decreased in magnitude (from -0.62 to -0.27). Formulas for an indirect prediction of PL from correlated traits were developed. As maximum PL per lactation was increased, the weights used to predict the various measures of PL changed little, with the exception of the weight for DO. As the currently used value of PL(305) does not properly account for the longer lactation lengths that are routinely occurring with today's cows, PL with longer lactations may be preferable in routine evaluation.
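The three PL definitions differ only in the maximum number of days credited per lactation, which a few lines of code make explicit. This is a simplified sketch: it assumes the lactation lengths supplied have already been truncated at 84 mo of age, and the example cow is hypothetical.

```python
def productive_life(days_in_milk, cap):
    """Sum days in milk over lactations, crediting at most `cap` days per lactation."""
    return sum(min(d, cap) for d in days_in_milk)

# Hypothetical cow: four lactations, lengths in days, already cut off at 84 mo of age
lactations = [365, 540, 335, 120]

pl_305 = productive_life(lactations, 305)
pl_500 = productive_life(lactations, 500)
pl_999 = productive_life(lactations, 999)
print(pl_305, pl_500, pl_999)   # → 1035 1320 1360
```

For this cow, PL(305) ignores 325 d that she actually spent in milk, which illustrates why a higher cap tracks long lactations more closely.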
Data included 585,119 test-day records for milk, fat, and protein yields from the first, second, and third parities of 38,608 Holsteins in Georgia. Daily temperature-humidity indexes (THI) were available from public weather stations. Models included a repeatability test-day model with a random regression on a function of THI and a test-day random regression model using linear splines with knots at 5, 50, 200, and 305 d in milk and a function of THI. Random effects were additive genetic and permanent environmental in the repeatability model and additive genetic, permanent environmental, and herd year in the random regression model. Additionally, models included fixed effects for herd test day, calving age, milking frequency, and lactation stage. Phenotypic variance increased by 50 to 60% from the first to second parity for all yield traits with the repeatability model and by 12 to 15% from the second to third parity. General additive genetic variance increased by 25 to 35% from the first to second parity for all yield traits but decreased slightly from the second to third parity for milk and protein yields. Genetic variance for heat tolerance doubled from the first to second parity and increased by 20 to 100% from the second to third parity. Genetic correlations among general additive effects were lowest between the first and second parities (0.84 to 0.88) and were highest between the second and third parities (0.96 to 0.98). Genetic correlations among parities for the effect of heat tolerance ranged from 0.56 to 0.79. Genetic correlations between general and heat-tolerance effects across parities and yield traits ranged from -0.30 to -0.50. With the random regression model, genetic variance for heat tolerance for milk yield was approximately one-half that of the repeatability model. 
For milk yield, the most negative genetic correlation (approximately -0.45) between general and heat-tolerance effects was between 50 and 200 d in milk for the first parity and between 200 and 305 d in milk for the second and third parities. The genetic variance of heat tolerance increased substantially from the first to third parity. Genetic estimates of heat tolerance may be inflated with the repeatability model because of timing of lactations to avoid peak yield during hot seasons.
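The linear-spline part of the random regression model can be sketched from the knots given in the abstract (5, 50, 200, and 305 d in milk). Each test day gets nonzero covariates only for the two knots that bracket it. The heat-load function is an assumption: the abstract says only "a function of THI", so the threshold form and the default value of 72 are hypothetical, as is the clamping of days in milk outside the knot range.

```python
KNOTS = (5.0, 50.0, 200.0, 305.0)    # knots in days in milk, from the abstract

def spline_covariates(dim, knots=KNOTS):
    """Linear-spline covariates: weights on the two knots that bracket `dim`."""
    t = min(max(float(dim), knots[0]), knots[-1])   # clamp outside the knot range
    cov = [0.0] * len(knots)
    for i in range(len(knots) - 1):
        lo, hi = knots[i], knots[i + 1]
        if lo <= t <= hi:
            w = (t - lo) / (hi - lo)                # position between the two knots
            cov[i], cov[i + 1] = 1.0 - w, w
            break
    return cov

def heat_load(thi, threshold=72.0):
    """Degrees of THI above a threshold (hypothetical form and threshold)."""
    return max(0.0, thi - threshold)

print(spline_covariates(125))    # → [0.0, 0.5, 0.5, 0.0]
print(heat_load(78.0))           # → 6.0
```

Because the covariates always sum to 1, the regression coefficients can be read directly as the effect at each knot, with linear interpolation in between.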
The purpose of this study was to evaluate the accuracy of genomic selection in single-step genomic BLUP (ssGBLUP) when the inverse of the genomic relationship matrix (G) is derived by the "algorithm for proven and young animals" (APY). This algorithm implements genomic recursions on a subset of "proven" animals. Only a relationship matrix for animals treated as "proven" needs to be inverted, and the extra costs of adding animals treated as "young" are linear. Analyses involved 10,102,702 final scores on 6,930,618 Holstein cows. Final score, which is a composite of type traits, is a popular trait in the United States and was easily available for this study. A total of 100,000 animals with genotypes were used in the analyses and included 23,000 sires (16,000 with >5 progeny), 27,000 cows, and 50,000 young animals. Genomic EBV (GEBV) were calculated with a regular inverse of G, and with the G inverse approximated by APY. Animals in the proven subset included only sires (23,000), sires+cows (50,000), only cows (27,000), or sires with >5 progeny (16,000). The correlations of GEBV with APY and regular GEBV for young genotyped animals were 0.994, 0.995, 0.992, and 0.992, respectively. Later, animals in the proven subset were randomly sampled from all genotyped animals in sets of 2,000, 5,000, 10,000, 15,000, and 20,000; each sample was replicated 4 times. Respective correlations were 0.97 (5,000 sample), 0.98 (10,000 sample), and 0.99 (20,000 sample), with minimal difference between samples of the same size. Genomic EBV with APY were accurate when the number of animals used in the subset was between 10,000 and 20,000, with little difference between the ways of creating the subset. Due to the approximately linear cost of APY, ssGBLUP with APY could support any number of genotyped animals without affecting accuracy.
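The structure of the APY inverse explains why only the "proven" block needs a dense inversion. The sketch below assumes the usual published form: with c denoting proven (core) and n denoting young (non-core) animals, the inverse is built from G_cc⁻¹, the recursion coefficients G_cc⁻¹G_cn, and a diagonal matrix M with elements g_ii − g_ic G_cc⁻¹ g_ci. The function name and the simulated G are hypothetical.

```python
import numpy as np

def apy_inverse(G, core):
    """Approximate G^-1 using recursions of non-core animals on core animals."""
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])      # only dense inverse needed
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                                   # recursion coefficients
    m = G[noncore, noncore] - np.sum(Gcn * P, axis=0)   # diagonal of M, one per young animal

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    G_inv[np.ix_(core, noncore)] = -P / m
    G_inv[np.ix_(noncore, core)] = (-P / m).T
    G_inv[np.ix_(noncore, noncore)] = np.diag(1.0 / m)  # diagonal block: linear cost
    return G_inv

rng = np.random.default_rng(7)
Z = rng.normal(size=(6, 40))                            # simulated marker covariates
G = Z @ Z.T / 40 + 0.01 * np.eye(6)                     # positive definite toy G

G_apy_inv = apy_inverse(G, core=[0, 1, 2])              # animals 3 to 5 treated as young
print(np.round(G_apy_inv, 2))
```

Each additional young animal adds one column of recursion coefficients and one diagonal element, so storage and computing for the young block grow linearly with the number of young animals.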