Maize is both an exciting model organism in plant genetics and the most important crop worldwide for food, animal feed and bioenergy production. Recent genome-wide association and metabolic profiling studies have aimed to resolve quantitative traits to their causal genetic loci and key metabolic regulators. Here we present a complementary approach that exploits large-scale genomic and metabolic information to predict complex, highly polygenic traits in hybrid testcrosses. We crossed 285 diverse Dent inbred lines from worldwide sources with two testers and predicted their combining abilities for seven biomass- and bioenergy-related traits using 56,110 SNPs and 130 metabolites. Whole-genome and metabolic prediction models were built by fitting effects for all SNPs or metabolites. Prediction accuracies ranged from 0.72 to 0.81 for SNPs and from 0.60 to 0.80 for metabolites, allowing reliable screening of large collections of diverse inbred lines for their potential to create superior hybrids.
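A minimal sketch of what "fitting effects for all SNPs" can look like in practice, assuming a ridge-type penalty (one common choice for whole-genome prediction; the abstract does not name the estimator). The data are simulated, and the function names, the penalty `lam`, and all sizes are hypothetical choices for illustration, not values from the study.

```python
import numpy as np

def fit_ridge_marker_effects(X, y, lam):
    """Estimate an intercept and one effect per marker with an L2 penalty.

    Uses the dual form, which solves an n x n system and is therefore
    cheap when the number of markers far exceeds the number of lines.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                      # centre marker codes
    yc = y - y_mean                      # centre phenotypes
    n = X.shape[0]
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), yc)
    effects = Xc.T @ alpha               # back to one effect per marker
    intercept = y_mean - x_mean @ effects
    return intercept, effects

def predict(X, intercept, effects):
    """Predicted value of each line from its marker profile."""
    return intercept + X @ effects

def prediction_accuracy(y_obs, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])

# Simulated example (all numbers are assumptions, not study data).
rng = np.random.default_rng(0)
n_lines, n_markers = 120, 500
X = rng.integers(0, 2, size=(n_lines, n_markers)).astype(float) * 2  # inbreds: 0 or 2
true_effects = rng.normal(0.0, 0.1, n_markers)
y = X @ true_effects + rng.normal(0.0, 1.0, n_lines)

train, test = np.arange(0, 90), np.arange(90, 120)
b0, b = fit_ridge_marker_effects(X[train], y[train], lam=50.0)
acc = prediction_accuracy(y[test], predict(X[test], b0, b))
print(f"accuracy in held-out lines: {acc:.2f}")
```

The same code applies to metabolite profiles by passing a lines-by-metabolites matrix as `X`. In a real analysis the penalty would be tuned or derived from variance components rather than fixed by hand.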
The diversity of metabolites found in plants is by far greater than in most other organisms. Metabolic profiling techniques, which measure many of these compounds simultaneously, have enabled the investigation of the regulation of metabolic networks and have proved useful for predicting important agronomic traits. However, little is known about the genetic basis of metabolites in crops such as maize. Here, a set of 289 diverse maize inbred lines was genotyped with 56,110 SNPs and assayed for 118 biochemical compounds in the leaves of young plants, as well as for agronomic traits of mature plants in field trials. Metabolite concentrations had on average a repeatability of 0.73 and showed a correlation pattern that largely reflected their functional grouping. Genome-wide association mapping with correction for population structure and cryptic relatedness identified strong associations with SNPs for 26 distinct metabolites, explaining up to 32.0% of the observed genetic variance. On nine chromosomes, we detected 15 distinct SNP-metabolite associations, each of which explained more than 15% of the genetic variance. For lignin precursors, including p-coumaric acid and caffeic acid, we found strong associations (P values 2.7 × 10⁻¹⁰ to 3.9 × 10⁻¹⁸) with a region on chromosome 9 harboring cinnamoyl-CoA reductase, a key enzyme in monolignol synthesis and a target for improving the quality of lignocellulosic biomass by genetic engineering approaches. Moreover, lignin precursors correlated significantly with lignin content, plant height, and dry matter yield, suggesting that metabolites represent promising connecting links for narrowing the genotype-phenotype gap of complex agronomic traits.
Keywords: genetic association | metabolomics | Zea mays
Genomic prediction is expected to considerably increase genetic gains by increasing selection intensity and accelerating the breeding cycle. In this study, marker effects estimated in 255 diverse maize (Zea mays L.) hybrids were used to predict grain yield, anthesis date, and anthesis-silking interval within the diversity panel and in testcross progenies of 30 F2-derived lines from each of five populations. Although up to 25% of the genetic variance could be explained by cross validation within the diversity panel, the predictive ability for testcross performance of F2-derived lines using marker effects estimated in the diversity panel was on average zero. Hybrids in the diversity panel could be grouped into eight breeding populations differing in mean performance. When performance was predicted separately for each breeding population on the basis of marker effects estimated in the other populations, predictive ability was low (i.e., 0.12 for grain yield). These results suggest that prediction resulted mostly from differences in mean performance of the breeding populations and less from the relationship between the training and validation sets or linkage disequilibrium with causal variants underlying the predicted traits. Potential uses for genomic prediction in maize hybrid breeding are discussed, emphasizing the need for (1) a clear definition of the breeding scenario in which genomic prediction should be applied (i.e., prediction among or within populations), (2) a detailed analysis of the population structure before performing cross validation, and (3) larger training sets with strong genetic relationship to the validation set.
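The point that predictive ability can come mostly from differences between population means, rather than from ranking lines within a population, can be illustrated with a small toy calculation. The numbers below are invented for illustration only (two hypothetical populations, `pop_A` and `pop_B`); they are not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical observed and predicted performance. The predictions
# capture the population means (about 5 vs. about 10) but carry no
# information about the ranking of lines inside each population.
observed = {"pop_A": [4.0, 5.0, 6.0, 5.0], "pop_B": [9.0, 10.0, 11.0, 10.0]}
predicted = {"pop_A": [5.0, 4.0, 5.0, 6.0], "pop_B": [10.0, 9.0, 10.0, 11.0]}

# Pooled across populations: dominated by the difference in means.
pooled_obs = [v for pop in observed for v in observed[pop]]
pooled_pred = [v for pop in observed for v in predicted[pop]]
pooled_r = pearson(pooled_obs, pooled_pred)

# Within each population: the quantity relevant for selecting among lines.
within_r = {pop: pearson(observed[pop], predicted[pop]) for pop in observed}

print(f"pooled r = {pooled_r:.3f}")
for pop, r in within_r.items():
    print(f"{pop}: within-population r = {r:.3f}")
```

Here the pooled correlation is high while both within-population correlations are zero, which is why analysing population structure before cross validation matters when the breeding goal is selection within populations.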
Identifying high-performing hybrids is an essential part of every maize breeding program. Genomic prediction of maize hybrid performance makes it possible to identify promising hybrids even when they themselves, or other hybrids produced from their parents, have not been tested in field trials. Using simulations, we investigated the effects of marker density (10, 1, or 0.3 markers per megabase pair, Mbp⁻¹), convergent or divergent parental populations, number of parents tested in other combinations (2, 1, 0), genetic model (including population-specific and/or dominance marker effects or not), and estimation method (GBLUP or BayesB) on the prediction accuracy. We based our simulations on marker genotypes of Central European flint and dent inbred lines from an ongoing maize breeding program. To simulate convergent or divergent parent populations, we generated phenotypes by assigning QTL to markers with similar or very different allele frequencies in both pools, respectively. Prediction accuracies increased with marker density and number of parents tested and were higher under divergent compared with convergent parental populations. Modeling marker effects as population-specific slightly improved prediction accuracy under lower marker densities (1 and 0.3 Mbp⁻¹). This indicated that modeling marker effects as population-specific will be most beneficial under low linkage disequilibrium. Incorporating dominance effects improved prediction accuracies considerably for convergent parent populations, where dominance results in major contributions of SCA effects to the genetic variance among inter-population hybrids. While the general trends regarding the effects of the aforementioned factors on prediction accuracy were similar for GBLUP and BayesB, the latter method produced significantly higher accuracies for models incorporating dominance.
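To make the idea of additive and dominance marker effects concrete, the sketch below derives hybrid marker codes from two homozygous inbred parents and combines them with per-marker effects. The 0/1 parental coding, the function names, and all effect values are assumptions chosen for illustration; the abstract does not specify the coding used in the simulations.

```python
def hybrid_marker_codes(parent_a, parent_b):
    """Additive and dominance codes of a single-cross hybrid.

    Each parent is a fully homozygous inbred, coded per marker as 0 or 1
    for the allele it carries. The hybrid's additive code is its allele
    count (0, 1, or 2); the dominance code is 1 where it is heterozygous.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be genotyped at the same markers")
    additive = [a + b for a, b in zip(parent_a, parent_b)]
    dominance = [1 if a != b else 0 for a, b in zip(parent_a, parent_b)]
    return additive, dominance

def genotypic_value(additive, dominance, a_effects, d_effects, mean=0.0):
    """Sum of additive and dominance contributions over all markers."""
    add_part = sum(x * a for x, a in zip(additive, a_effects))
    dom_part = sum(w * d for w, d in zip(dominance, d_effects))
    return mean + add_part + dom_part

# Hypothetical flint and dent parents at four markers.
flint = [0, 1, 1, 0]
dent = [0, 1, 0, 1]
additive, dominance = hybrid_marker_codes(flint, dent)

# Hypothetical marker effects.
a_effects = [0.5, 0.2, -0.1, 0.3]
d_effects = [0.1, 0.1, 0.4, 0.2]
value = genotypic_value(additive, dominance, a_effects, d_effects)
print(additive, dominance, round(value, 3))
```

Dominance contributes only at markers where the two parents carry different alleles, so a model without the dominance term cannot represent that part of the hybrid's value.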
Intense structuring of plant breeding populations challenges the design of the training set (TS) in genomic selection (GS). An important open question is how the TS should be constructed from multiple related or unrelated small biparental families to predict progeny from individual crosses. Here, we used a set of five interconnected maize (Zea mays L.) populations of doubled-haploid (DH) lines derived from four parents to systematically investigate how the composition of the TS affects the prediction accuracy for lines from individual crosses. A total of 635 DH lines genotyped with 16,741 polymorphic SNPs were evaluated for five traits including Gibberella ear rot severity and three kernel yield component traits. The populations showed a genomic similarity pattern that reflects the crossing scheme, with a clear separation of full sibs, half sibs, and unrelated groups. Prediction accuracies within full-sib families of DH lines closely followed theoretical expectations, accounting for the influence of sample size and heritability of the trait. Prediction accuracies declined by 42% if full-sib DH lines were replaced by half-sib DH lines, but statistically significantly better results could be achieved if half-sib DH lines were available from both instead of only one parent of the validation population. Once both parents of the validation population were represented in the TS, including more crosses with a constant TS size did not increase accuracies. Unrelated crosses showing opposite linkage phases with the validation population resulted in negative or reduced prediction accuracies, if used alone or in combination with related families, respectively. We suggest identifying and excluding such crosses from the TS.
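The dependence of expected accuracy on sample size and heritability can be sketched with a widely used deterministic approximation, r = sqrt(N h² / (N h² + Me)), where N is the training set size, h² the heritability, and Me the number of independent chromosome segments. This is an assumption for illustration: the abstract does not say which theoretical expectation was used, and the value of `m_e` below is hypothetical.

```python
from math import sqrt

def expected_accuracy(n_train, h2, m_e):
    """Approximate expected correlation between predicted and true genetic values.

    n_train: number of phenotyped and genotyped training individuals
    h2:      trait heritability, between 0 and 1
    m_e:     number of independent chromosome segments (assumed known)
    """
    if n_train <= 0 or m_e <= 0:
        raise ValueError("n_train and m_e must be positive")
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("h2 must lie in [0, 1]")
    return sqrt(n_train * h2 / (n_train * h2 + m_e))

# Hypothetical scenario: m_e = 50 segments within a full-sib family.
for n in (50, 100, 200):
    for h2 in (0.3, 0.5, 0.8):
        print(f"N={n:3d}  h2={h2:.1f}  expected r={expected_accuracy(n, h2, 50):.2f}")
```

Under this approximation, accuracy rises with both training set size and heritability but with diminishing returns, which is consistent with the qualitative statement above.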
Moreover, the observed variability among populations and traits suggests that these uncertainties must be taken into account in models optimizing the allocation of resources in GS.
Genomic prediction or selection, initially proposed and rapidly implemented in animal breeding (Meuwissen et al. 2001), is increasingly applied in plant breeding (Bernardo and Yu 2009; Lorenz et al. 2011; Morrell et al. 2011). However, with more investigations in plant breeding applications, new challenges emerge, mainly as the result of the greater possibilities of genetic manipulation and reproduction modes in plants compared to animals. Genomic predictions within diverse populations (Crossa et al. 2010; Riedelsheimer et al. 2012a,b; Windhausen et al. 2012) largely overlap with the scenarios in animal breeding. Predicting crossbred performance in animals also has some similarities with the prediction of maize hybrids from fully homozygous inbred lines drawn from genetically distant heterotic pools (Technow et al. 2012). Apart from using mixtures of close and distant relatives, however, little overlap with the field of animal breeding exists in the case of multiple related or unrelated biparental families of inbred lines with a size ≤200. The latter situation is investigated in this study, because it covers the most rele...