Since the 1980s, molecular markers have largely been considered as an add-on to cultivar development. Marker applications for quantitative traits have been investigated in the context of the question: "Given the current methods for breeding crops, how can molecular markers enhance breeding progress?" Viewing markers primarily as an aid to selection was a natural consequence of the high cost of genotyping for molecular markers. In the 1990s, for example, the cost of genotyping one sample for one restriction fragment length polymorphism (RFLP) or simple sequence repeat (SSR) marker (i.e., one data point) was more than US$1 (Linkage Genetics, pers. comm.; Biogenetic Services, pers. comm.). Advances in high-throughput genotyping have markedly reduced the cost per data point of molecular markers. This reduction was mainly the result of three parallel developments (Jenkins and Gibson, 2002; Syvänen, 2005): (i) the discovery of vast numbers of single nucleotide polymorphism (SNP) markers in many species; (ii) development of high-throughput technologies, such as multiplexing and gel-free DNA arrays, for screening SNP polymorphisms; and (iii) automation of the marker-genotyping process, including streamlined procedures for DNA extraction.
ABSTRACT
The availability of cheap and abundant molecular markers in maize (Zea mays L.) has allowed breeders to ask how molecular markers may best be used to achieve breeding progress, without conditioning the question on how breeding has traditionally been done. Genomewide selection refers to marker-based selection without first identifying a subset of markers with significant effects. Our objectives were to assess the response due to genomewide selection compared with marker-assisted recurrent selection (MARS) and to determine the extent to which phenotyping can be minimized and genotyping maximized in genomewide selection.
We simulated genomewide selection by evaluating doubled haploids for testcross performance in Cycle 0, followed by two cycles of selection based on markers. Individuals were genotyped for N_M markers, and breeding values associated with each of the N_M markers were predicted and were all used in genomewide selection. We found that across different numbers of quantitative trait loci (20, 40, and 100) and levels of heritability, the response to genomewide selection was 18 to 43% larger than the response to MARS. Responses to selection were maintained when the number of doubled haploids phenotyped and genotyped in Cycle 0 was reduced and the number of plants genotyped in Cycles 1 and 2 was increased. Such schemes that minimize phenotyping and maximize genotyping would be feasible only if the cost per marker data point is reduced to about 2 cents. The convenient but incorrect assumption of equal marker variances led to only a minimal loss in the response to genomewide selection. We conclude that genomewide selection, as a brute-force and black-box procedure that exploits cheap and abundant molecular markers, is superior to MARS in maize.
In the mid‐1980s, the development of abundant molecular markers, appropriate statistical procedures, and user‐friendly computer software that implemented these statistical procedures permitted the detection of molecular markers associated with quantitative trait loci (QTL) for complex traits. Marker‐assisted selection was then proposed as a means of exploiting markers linked to QTL to develop improved cultivars. But while thousands of marker‐trait associations have been reported for many traits in different plant species, far fewer examples of successfully exploiting mapped QTL have been reported in the literature. Key lessons learned from applying markers in plant breeding include the following: (i) the purpose of detecting QTL should be clearly defined before embarking on QTL mapping; (ii) procedures for marker‐based selection depend on the number of QTL; (iii) estimates of QTL effects for complex traits are often inconsistent; and (iv) gain per unit cost and time rather than gain per cycle should be considered. Future applications for complex traits will likely focus on predictive methodologies for marker‐based selection before phenotyping and for marker‐based selection without QTL mapping. These applications will take advantage of cheaper costs of genotyping than of phenotyping.
The availability of cheap and abundant molecular markers has led to plant-breeding methods that rely on the prediction of genotypic value from marker data, but published information is lacking on the accuracy of genotypic value predictions with empirical data in plants. Our objectives were to (1) determine the accuracy of genotypic value predictions from multiple linear regression (MLR) and genomewide selection via best linear unbiased prediction (BLUP) in biparental plant populations; (2) assess the accuracy of predictions for different numbers of markers (N(M)) and progenies (N(P)) used in estimation; and (3) determine if an empirical Bayes approach for modeling of the variances of individual markers and of epistatic effects leads to more accurate predictions in empirical data. We divided each of four maize (Zea mays L.) datasets, one Arabidopsis dataset, and two barley (Hordeum vulgare L.) datasets into an estimation set, where marker effects were calculated, and a test set, where genotypic values were predicted based on markers. Predictions were more accurate with BLUP than with MLR. Predictions became more accurate as N(P) and N(M) increased, until sufficient genome coverage was reached. Modeling marker variances with the empirical Bayes method sometimes led to slightly better predictions, but the accuracy with different variants of the empirical Bayes method was often inconsistent. In nearly all cases, the accuracy with BLUP was not significantly different from the highest accuracy across all methods. Accounting for epistasis in the empirical Bayes procedure led to poorer predictions. We concluded that among the methods considered, the quick and simple BLUP approach is the method of choice for predicting genotypic value in biparental plant populations.
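The estimation-set and test-set procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes markers coded as -1/1, a single shrinkage parameter `lam` shared by all markers (in RR-BLUP this would be the ratio of residual variance to per-marker variance), and simulated data. The function names `rr_blup`, `predict`, and `pearson`, the value `lam=10.0`, and all simulation settings are hypothetical choices.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rr_blup(Z, y, lam):
    """Ridge estimates of all marker effects with an unpenalized intercept.

    Solves (W'W + lam*I) u = W'(y - mean(y)), where W is Z with centered columns.
    """
    n, p = len(Z), len(Z[0])
    mu = sum(y) / n
    zbar = [sum(row[j] for row in Z) / n for j in range(p)]
    W = [[row[j] - zbar[j] for j in range(p)] for row in Z]
    yc = [v - mu for v in y]
    lhs = [[sum(w[a] * w[b] for w in W) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    rhs = [sum(w[a] * v for w, v in zip(W, yc)) for a in range(p)]
    effects = solve(lhs, rhs)
    intercept = mu - sum(zb * e for zb, e in zip(zbar, effects))
    return intercept, effects

def predict(Z, intercept, effects):
    """Predicted genotypic value = intercept + sum of marker effects."""
    return [intercept + sum(z * e for z, e in zip(row, effects)) for row in Z]

def pearson(a, b):
    """Pearson correlation, used here as the measure of prediction accuracy."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

if __name__ == "__main__":
    rng = random.Random(1)            # fixed seed so the run is repeatable
    n_p, n_test, n_m = 60, 20, 30     # hypothetical N(P), test-set size, N(M)
    Z = [[rng.choice((-1, 1)) for _ in range(n_m)] for _ in range(n_p + n_test)]
    true_eff = [rng.gauss(0, 1) if j < 5 else 0.0 for j in range(n_m)]
    g = [sum(z * e for z, e in zip(row, true_eff)) for row in Z]
    y = [gi + rng.gauss(0, 1) for gi in g]        # phenotype = genotype + noise
    b0, eff = rr_blup(Z[:n_p], y[:n_p], lam=10.0)  # estimation set
    pred = predict(Z[n_p:], b0, eff)               # test set
    print("accuracy in test set:", round(pearson(pred, g[n_p:]), 2))
```

Setting `lam` to 0 reduces the calculation to ordinary least squares on the markers, which is only solvable when there are fewer markers than progenies and no collinear markers. The shrinkage term is what allows all markers to be fitted at once.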
Methods for predicting hybrid yield would facilitate the identification of superior maize (Zea mays L.) single crosses. Best linear unbiased prediction of the performance of single crosses, based on (i) restriction fragment length polymorphism (RFLP) data on the parental inbreds and (ii) yield data on a related set of single crosses, was evaluated. Yields of m single crosses were predicted as $y_M = C V^{-1} y_P$, where: $y_M$ = m × 1 vector of predicted yields of missing (i.e., no yield data available) single crosses; C = m × n matrix of genetic covariances between the missing and predictor hybrids; V = n × n matrix of phenotypic variances and covariances among predictor hybrids; and $y_P$ = n × 1 vector of predictor hybrid yields corrected for trial effects. From a set of 54 single crosses, made between six Iowa Stiff Stalk Synthetic (SSS) and nine non‐SSS inbreds, 100 different sets of n = 10, 15, 20, 25, or 30 predictor hybrids were chosen at random. Pooled correlations between predicted and observed yields of the remaining (54 − n) hybrids ranged from 0.654 to 0.800. The correlations were slightly higher when dominance variance was included in the model or when coefficients of coancestry were determined from RFLP rather than pedigree data. The correlations remained relatively stable across different, arbitrary values of genetic variances. The results suggested that single‐cross yield can be predicted effectively based on parental RFLP data and yields of a related set of hybrids.
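The prediction formula in this abstract (predicted yields equal C times the inverse of V times the predictor yields) can be illustrated with a small numerical sketch. The matrices below are made-up values for n = 3 predictor hybrids and m = 2 missing hybrids, not data from the study, and the helper name `blup_predict` is hypothetical. In practice C and V would be built from coancestry coefficients and variance components, which are not reproduced here.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and nonsingular (true for a valid covariance matrix V).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def blup_predict(C, V, y_p):
    """Return y_M = C V^-1 y_P without forming the inverse explicitly."""
    w = solve(V, y_p)  # w = V^-1 y_P
    return [sum(c * x for c, x in zip(row, w)) for row in C]

if __name__ == "__main__":
    # Hypothetical phenotypic (co)variances among three predictor hybrids.
    V = [[1.0, 0.3, 0.2],
         [0.3, 1.0, 0.4],
         [0.2, 0.4, 1.0]]
    # Hypothetical genetic covariances between two missing hybrids (rows)
    # and the three predictor hybrids (columns).
    C = [[0.5, 0.2, 0.1],
         [0.1, 0.3, 0.4]]
    # Predictor yields expressed as deviations after correcting for trial effects.
    y_p = [0.8, -0.3, 0.5]
    print(blup_predict(C, V, y_p))
```

Solving the linear system instead of inverting V is a numerical convenience only; the result is the same quantity as in the formula.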
Current methods for genomewide selection do not distinguish between known major genes and random genomewide markers. My objectives were to determine if explicitly modeling the effects of known major genes affects the response to genomewide selection, and to identify situations in which considering major genes as having fixed effects is helpful. Simulation experiments showed that having a fixed effect for a major gene became more advantageous as the percentage of genetic variance (V_G) explained by a major gene (R²) increased and as the heritability on an entry‐mean basis (h²) increased. With R² = 50% and h² = 0.80, the relative efficiency (based on selection gains in Cycle 4) with a major gene having a fixed versus random effect was 112–121%. Specifying a fixed effect for a single major gene was never disadvantageous except with R² < 10%. With h² ≥ 0.50, specifying a fixed versus random effect for a single major gene had little effect on prediction accuracy in Cycle 0. However, prediction accuracy in later cycles declined more rapidly when a major gene had a random effect instead of a fixed effect. The results with L = 2 or 3 major genes were similar to those with one major gene. In contrast, the usefulness of gene information was low with L = 10 major genes. Overall, major genes should be fitted as having fixed effects in genomewide selection when only a few major genes are present and each major gene accounts for ≥10% of V_G.
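The contrast between fixed and random major-gene effects can be written with a standard linear mixed model. The notation below is an assumption for illustration and is not taken from the source:

```latex
\mathbf{y} = \mathbf{1}\mu + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
\qquad \mathbf{u} \sim N(\mathbf{0}, \mathbf{I}\sigma_u^2),
\qquad \mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)
```

Here X is assumed to hold the genotypes at the L major genes and β their fixed effects, which are estimated without shrinkage. Z holds the genomewide marker genotypes and u their random effects. Treating a major gene as random would amount to moving its column from X into Z, so that its effect is shrunk toward zero like that of any other marker.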
The 7.4 million plant accessions in gene banks are largely underutilized due to various resource constraints, but current genomic and analytic technologies are enabling us to mine this natural heritage. Here we report a proof-of-concept study to integrate genomic prediction into a broad germplasm evaluation process. First, a set of 962 biomass sorghum accessions were chosen as a reference set by germplasm curators. With high throughput genotyping-by-sequencing (GBS), we genetically characterized this reference set with 340,496 single nucleotide polymorphisms (SNPs). A set of 299 accessions was selected as the training set to represent the overall diversity of the reference set, and we phenotypically characterized the training set for biomass yield and other related traits. Cross-validation with multiple analytical methods using the data of this training set indicated high prediction accuracy for biomass yield. Empirical experiments with a 200-accession validation set chosen from the reference set confirmed high prediction accuracy. The potential to apply the prediction model to broader genetic contexts was also examined with an independent population. Detailed analyses on prediction reliability provided new insights into strategy optimization. The success of this project illustrates that a global, cost-effective strategy may be designed to assess the vast amount of valuable germplasm archived in 1,750 gene banks.
Marker-assisted selection for quantitative traits has traditionally relied on first identifying markers linked to quantitative trait loci (QTL). A specific form of marker-assisted selection in maize (Zea mays L.) is marker-assisted recurrent selection (MARS) in which (i) one generation of phenotypic selection in the target environment is conducted, (ii) markers with significant effects are used to predict the performance of individual plants, and (iii) several generations of marker-only selection are performed in a year-round nursery or greenhouse. Empirical results from private breeding programs have shown MARS to be effective at improving quantitative traits in maize, soybean [Glycine max (L.) Merr.], and sunflower (Helianthus annuus L.) (Johnson, 2004; Eathington et al., 2007). Specifically, gains from selection in 248 maize breeding populations were more than twice as large with MARS than with standard phenotypic selection methods (Eathington et al., 2007). In contrast to previous MARS or other QTL-based selection strategies, genomewide selection (GWS, also called genomic selection; Meuwissen et al., 2001) does not involve tests of significance and uses all available markers to predict performance. Simulation
ABSTRACT
Genomewide selection (GWS) is marker-assisted selection without identifying markers with significant effects. Our previous work with the intermated B73 × Mo17 maize (Zea mays L.) population revealed significant variation for grain yield and stover-quality traits important for cellulosic ethanol production. Our objectives were to determine (i) if realized gains from selection are larger with GWS than with marker-assisted recurrent selection (MARS), which involves selection for markers with significant effects; and (ii) how multiple traits respond to multiple cycles of GWS and MARS.
In 2007, testcrosses of 223 recombinant inbreds developed from B73 × Mo17 (Cycle 0) were evaluated at four Minnesota locations and genotyped with 287 single nucleotide polymorphism markers. Individuals with the best performance for a Stover Index and a Yield + Stover Index were intermated to form Cycle 1. Both GWS and MARS were then conducted until Cycle 3. Multilocation trials in 2010 indicated that gains for the Stover Index and Yield + Stover Index were 14 to 50% larger (significant at P = 0.05) with GWS than with MARS. Gains in individual traits were mostly nonsignificant. Inbreeding coefficients ranged from 0.28 to 0.38 by Cycle 3 of GWS and MARS. For stover-quality traits, correlations between wet chemistry and near-infrared reflectance spectroscopy predictions decreased after selection. We believe this is the first published report of a GWS experiment in crops, and our results indicate that using all available markers for predicting genotypic value leads to greater gain than using a subset of markers with significant effects.
Maize (Zea mays L.) breeders evaluate many single-cross hybrids each year in multiple environments. Our objective was to determine the usefulness of genomewide predictions, based on marker effects from maize single-cross data, for identifying the best untested single crosses and the best inbreds within a biparental cross. We considered 479 experimental maize single crosses between 59 Iowa Stiff Stalk Synthetic (BSSS) inbreds and 44 non-BSSS inbreds. The single crosses were evaluated in multilocation experiments from 2001 to 2009 and the BSSS and non-BSSS inbreds had genotypic data for 669 single nucleotide polymorphism (SNP) markers. Single-cross performance was predicted by a previous best linear unbiased prediction (BLUP) approach that utilized marker-based relatedness and information on relatives, and from genomewide marker effects calculated by ridge-regression BLUP (RR-BLUP). With BLUP, the mean prediction accuracy (r(MG)) of single-cross performance was 0.87 for grain yield, 0.90 for grain moisture, 0.69 for stalk lodging, and 0.84 for root lodging. The BLUP and RR-BLUP models did not lead to r(MG) values that differed significantly. We then used the RR-BLUP model, developed from single-cross data, to predict the performance of testcrosses within 14 biparental populations. The r(MG) values within each testcross population were generally low and were often negative. These results were obtained despite the above-average level of linkage disequilibrium, i.e., r² between adjacent markers of 0.35 in the BSSS inbreds and 0.26 in the non-BSSS inbreds. Overall, our results suggested that genomewide marker effects estimated from maize single crosses are not advantageous (compared with BLUP) for predicting single-cross performance and have erratic usefulness for predicting testcross performance within a biparental cross.