Genotype-by-environment interaction (GEI) is an important phenomenon in plant breeding. This paper presents a series of models for describing, exploring, understanding, and predicting GEI. All models depart from a two-way table of genotype by environment means. First, a series of descriptive and explorative models/approaches are presented: Finlay–Wilkinson model, AMMI model, GGE biplot. All of these approaches have in common that they merely try to group genotypes and environments and do not use other information than the two-way table of means. Next, factorial regression is introduced as an approach to explicitly introduce genotypic and environmental covariates for describing and explaining GEI. Finally, QTL modeling is presented as a natural extension of factorial regression, where marker information is translated into genetic predictors. Tests for regression coefficients corresponding to these genetic predictors are tests for main effect QTL expression and QTL by environment interaction (QEI). QTL models for which QEI depends on environmental covariables form an interesting model class for predicting GEI for new genotypes and new environments. For realistic modeling of genotypic differences across multiple environments, sophisticated mixed models are necessary to allow for heterogeneity of genetic variances and correlations across environments. The use and interpretation of all models is illustrated by an example data set from the CIMMYT maize breeding program, containing environments differing in drought and nitrogen stress. To help readers to carry out the statistical analyses, GenStat® programs, 15th Edition and Discovery® version, are presented as “Appendix.”
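As an illustration of the first descriptive approach named above, the following is a minimal sketch of a Finlay–Wilkinson regression computed directly from a two-way table of genotype-by-environment means. The table values, the function name `finlay_wilkinson`, and the use of the centered environment mean as the environmental index are assumptions made for illustration; they are not taken from the paper or its GenStat programs.

```python
def finlay_wilkinson(table):
    """Regress each genotype's means on an environmental index.

    table[i][j] is the mean of genotype i in environment j (balanced table).
    Returns (env_index, slopes), where env_index[j] is the environment mean
    minus the grand mean and slopes[i] is the sensitivity of genotype i.
    """
    n_gen = len(table)
    n_env = len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_gen * n_env)

    # Environmental index: centered environment means (they sum to zero).
    env_index = [
        sum(table[i][j] for i in range(n_gen)) / n_gen - grand_mean
        for j in range(n_env)
    ]

    # Least-squares slope per genotype; because the index is centered,
    # the slope reduces to sum(y * t) / sum(t * t).
    ss_index = sum(t * t for t in env_index)
    slopes = [
        sum(row[j] * env_index[j] for j in range(n_env)) / ss_index
        for row in table
    ]
    return env_index, slopes


# Hypothetical means: 3 genotypes (rows) in 3 environments (columns).
means = [
    [4.0, 5.0, 6.0],  # average sensitivity
    [3.0, 5.0, 7.0],  # responds strongly to better environments
    [5.0, 5.0, 5.0],  # insensitive to the environment
]
env_index, slopes = finlay_wilkinson(means)
print(env_index)  # [-1.0, 0.0, 1.0]
print(slopes)     # [1.0, 2.0, 0.0]
```

In this sketch a slope above 1 marks a genotype that is more sensitive than average to environmental quality, and a slope near 0 marks a genotype whose performance barely changes across environments. By construction the slopes average to 1 over genotypes.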
Background: Genome-wide association studies (GWAS) based on linkage disequilibrium (LD) provide a promising tool for the detection and fine mapping of quantitative trait loci (QTL) underlying complex agronomic traits. In this study we explored the genetic basis of variation for the traits heading date, plant height, thousand grain weight, starch content and crude protein content in a diverse collection of 224 spring barleys of worldwide origin. The whole panel was genotyped with a customized oligonucleotide pool assay containing 1536 SNPs using Illumina's GoldenGate technology, resulting in 957 successful SNPs covering all chromosomes. The morphological trait "row type" (two-rowed spike vs. six-rowed spike) was used to confirm the high level of selectivity and sensitivity of the approach. This study describes the detection of QTL for the above-mentioned agronomic traits by GWAS.
Results: Population structure in the panel was investigated by various methods, which identified six subgroups that are mainly based on spike morphology and region of origin. We explored the patterns of linkage disequilibrium (LD) across the whole panel for all seven barley chromosomes. Average LD was observed to decay below a critical level (r² value of 0.2) within a map distance of 5-10 cM. Phenotypic variation within the panel was reasonably large for all the traits. The heritabilities calculated for each trait over multi-environment experiments ranged between 0.90 and 0.95. Different statistical models were tested to control spurious LD caused by population structure and to calculate the P-value of marker-trait associations. Using a mixed linear model with kinship for controlling spurious LD effects, we found a total of 171 significant marker-trait associations, which delineate into 107 QTL regions. Across all traits these can be grouped into 57 novel QTL and 50 QTL that are congruent with previously mapped QTL positions.
Conclusions: Our results demonstrate that the described diverse barley panel can be efficiently used for GWAS of various quantitative traits, provided that population structure is appropriately taken into account. The observed significant marker-trait associations provide a refined insight into the genetic architecture of important agronomic traits in barley. However, individual QTL account only for a small portion of phenotypic variation, which may be due to insufficient marker coverage and/or the elimination of rare alleles prior to analysis. The fact that the combined SNP effects fall short of explaining the complete phenotypic variance may support the hypothesis that the expression of a quantitative trait is caused by a large number of very small effects that escape detection. Notwithstanding these limitations, the integration of GWAS with biparental linkage mapping and an ever-increasing body of genomic sequence information will facilitate the systematic isolation of agronomically important genes and subsequent analysis of their allelic diversity.
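The LD measure used in this abstract, r², can be sketched for a single pair of biallelic SNPs as follows. This is a minimal illustration that assumes fully homozygous lines, so that each line contributes one directly observed haplotype coded 0/1 per locus. The function name `ld_r2` and the allele vectors are hypothetical and are not the barley data.

```python
def ld_r2(locus_a, locus_b):
    """Squared correlation (r^2) between two biallelic loci.

    locus_a and locus_b are equal-length lists of 0/1 allele codes,
    one entry per homozygous line (an assumption of this sketch).
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("loci must be non-empty and of equal length")
    n = len(locus_a)
    p_a = sum(locus_a) / n                                  # freq. of allele 1 at A
    p_b = sum(locus_b) / n                                  # freq. of allele 1 at B
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # freq. of haplotype 1-1
    d = p_ab - p_a * p_b                                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return d * d / denom


# Hypothetical allele codes for eight lines at two SNPs.
snp_a = [1, 1, 1, 1, 0, 0, 0, 0]
snp_b = [1, 1, 1, 0, 0, 0, 0, 1]
print(ld_r2(snp_a, snp_b))  # 0.25
```

Averaging such pairwise values within bins of map distance gives an LD decay curve, on which a threshold such as the r² of 0.2 mentioned above can be read off.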
Heritability is a central parameter in quantitative genetics, from both an evolutionary and a breeding perspective. For plant traits heritability is traditionally estimated by comparing within- and between-genotype variability. This approach estimates broad-sense heritability and does not account for different genetic relatedness. With the availability of high-density markers there is growing interest in marker-based estimates of narrow-sense heritability, using mixed models in which genetic relatedness is estimated from genetic markers. Such estimates have received much attention in human genetics but are rarely reported for plant traits. A major obstacle is that current methodology and software assume a single phenotypic value per genotype, hence requiring genotypic means. An alternative that we propose here is to use mixed models at the individual plant or plot level. Using statistical arguments, simulations, and real data we investigate the feasibility of both approaches and how these affect genomic prediction with the best linear unbiased predictor and genome-wide association studies. Heritability estimates obtained from genotypic means had very large standard errors and were sometimes biologically unrealistic. Mixed models at the individual plant or plot level produced more realistic estimates, and for simulated traits standard errors were up to 13 times smaller. Genomic prediction was also improved by using these mixed models, with up to a 49% increase in accuracy. For genome-wide association studies on simulated traits, the use of individual plant data gave almost no increase in power. The new methodology is applicable to any complex trait where multiple replicates of individual genotypes can be scored. This includes important agronomic crops, as well as bacteria and fungi.
Keywords: marker-based estimation of heritability; GWAS; genomic prediction; Arabidopsis thaliana; one- vs. two-stage approaches
Narrow-sense heritability is an important parameter in quantitative genetics, determining the response to selection and representing the proportion of phenotypic variance that is due to additive genetic effects (Jacquard 1983; Ritland 1996; Visscher et al. 2006, 2008; Holland et al. 2010; Sillanpaa 2011). This definition of heritability goes back to Fisher (1918) and Wright (1920) almost a century ago. In plant species for which replicates of the same genotype are available (inbred lines, doubled haploids, clones), a different form of heritability, broad-sense heritability, is traditionally estimated by the intraclass correlation coefficient for genotypic effects, using estimates for within- and between-genotype variance. Broad-sense heritability is also referred to as repeatability and gives the proportion of phenotypic variance explained by heritable (additive) and nonheritable (dominance, epistasis) genetic variance. With the arrival of high-density genotyping there is growing interest in marker-based estimation of narrow-sense heritability (WTCCC 2007; Yang et al. 2010, 2011; Vatti...
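The traditional broad-sense estimate described in this excerpt, the intraclass correlation based on within- and between-genotype variance, can be sketched with a balanced one-way ANOVA. This illustrates only the classical estimator, not the marker-based mixed models the paper proposes. The function name `broad_sense_h2`, the balanced design, and the data are assumptions made for illustration.

```python
def broad_sense_h2(replicates):
    """Plot-level broad-sense heritability as an intraclass correlation.

    replicates[i] holds the replicate values of genotype i; every genotype
    must have the same number r >= 2 of replicates (balanced design).
    """
    g = len(replicates)
    r = len(replicates[0])
    if g < 2 or r < 2 or any(len(rep) != r for rep in replicates):
        raise ValueError("need a balanced design with g >= 2 and r >= 2")

    geno_means = [sum(rep) / r for rep in replicates]
    grand_mean = sum(geno_means) / g

    # Mean squares of the one-way ANOVA.
    ms_between = r * sum((m - grand_mean) ** 2 for m in geno_means) / (g - 1)
    ms_within = sum(
        (y - m) ** 2 for rep, m in zip(replicates, geno_means) for y in rep
    ) / (g * (r - 1))

    # Method-of-moments variance components; a negative genetic variance
    # estimate is truncated at zero.
    var_g = max(0.0, (ms_between - ms_within) / r)
    var_e = ms_within
    return var_g / (var_g + var_e)


# Hypothetical data: 3 genotypes with 2 replicates each.
data = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
h2 = broad_sense_h2(data)  # between MS = 32, within MS = 2, so H2 = 15/17
```

Replacing `var_e` by `var_e / r` in the last line of the function would give the heritability of genotypic means rather than of single plots, which is a different quantity and should be reported as such.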
Association or linkage disequilibrium (LD)-based mapping strategies are receiving increased attention for the identification of quantitative trait loci (QTL) in plants as an alternative to more traditional, purely linkage-based approaches. An attractive property of association approaches is that they do not require specially designed crosses between inbred parents, but can be applied to collections of genotypes with arbitrary and often unknown relationships between the genotypes. A less obvious additional attractive property is that association approaches offer possibilities for QTL identification in crops with hard to model segregation patterns. The availability of candidate genes and targeted marker systems facilitates association approaches, as will appropriate methods of analysis. We propose an association mapping approach based on mixed models with attention to the incorporation of the relationships between genotypes, whether induced by pedigree, population substructure, or otherwise. Furthermore, we emphasize the need to pay attention to the environmental features of the data as well, i.e., adequate representation of the relations among multiple observations on the same genotypes. We illustrate our modeling approach using 25 years of Dutch national variety list data on late blight resistance in the genetically complex crop of potato. As markers, we used nucleotide binding-site markers, a specific type of marker that targets resistance or resistance-analog genes. To assess the consistency of QTL identified by our mixed-model approach, a second independent data set was analyzed. Two markers were identified that are potentially useful in selection for late blight resistance in potato.
A good statistical analysis of genotype × environment interactions (G × E) is a key requirement for progress in any breeding program. Data for G × E analyses traditionally come from multi‐environment trials. In recent years, data are increasingly generated from managed stress trials, phenotyping platforms, and high‐throughput phenotyping techniques in the field. Simultaneously, and complementary to the phenotyping, more elaborate genotyping and envirotyping occur. All of these developments further increase the importance of a sound statistical framework for analyzing G × E. This paper presents considerations on such a framework from the point of view of the choices that need to be made with respect to the content of short academic courses on statistical methods for G × E. Based on our experiences in teaching statistical methods to plant breeders, between three and five days are reserved for specialized G × E courses. The audience in such courses includes MSc students, PhD students, postdocs, and researchers at breeding companies. For such specialized courses, we propose a collection of topics to be covered. Our outlook on G × E analyses is two‐fold. On the one hand, we see the G × E problem as the building of predictive models for genotype‐specific reaction norms. On the other hand, the G × E problem consists in the identification of suitable variance‐covariance models to describe heterogeneity of genetic variance and correlations across environments. Our preferred class of statistical models is the class of mixed linear‐bilinear models. These statistical models allow us to answer breeding questions on adaptation, adaptability, stability, and the identification and subdivision of the target population of environments. By a citation analysis of the literature on G × E, we show that our preference for mixed linear‐bilinear models for analyzing G × E is supported by recent trends in the types of methods for G × E analysis that are most frequently cited.
Despite QTL mapping being a routine procedure in plant breeding, approaches that fully exploit data from multi-trait multi-environment (MTME) trials are limited. Mixed models have been proposed both for multi-trait QTL analysis and multi-environment QTL analysis, but these approaches break down when the number of traits and environments increases. We present models for an efficient QTL analysis of MTME data with mixed models by reducing the dimensionality of the genetic variance-covariance matrix by structuring this matrix using direct products of relatively simple matrices representing variation in the trait and environmental dimension. In the context of MTME data, we address how to model QTL by environment interactions and the genetic basis of heterogeneity of variance and correlations between traits and environments. We illustrate our approach with an example including five traits across eight stress trials in CIMMYT maize. We detected 36 QTLs affecting yield, anthesis-silking interval, male flowering, ear number, and plant height in maize. Our approach does not require specialised software as it can be implemented in any statistical package with mixed model facilities.
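The dimension reduction described above, structuring the genetic variance-covariance matrix as a direct (Kronecker) product of a trait matrix and an environment matrix, can be sketched as follows. The matrices, their values, and the helper `kronecker` are hypothetical, and the choice of unstructured component matrices is an assumption of this sketch; the paper may use simpler structures for either dimension.

```python
def kronecker(a, b):
    """Kronecker (direct) product of two matrices given as lists of rows.

    Entry ((i, j), (k, l)) of the result equals a[i][k] * b[j][l], with the
    index of `a` varying slowest along both rows and columns.
    """
    return [
        [a_ik * b_jl for a_ik in row_a for b_jl in row_b]
        for row_a in a
        for row_b in b
    ]


# Hypothetical genetic covariance among 2 traits ...
trait_cov = [
    [1.0, 0.5],
    [0.5, 2.0],
]
# ... and among 3 environments.
env_cov = [
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.3],
    [0.0, 0.3, 1.0],
]

# 6 x 6 covariance among all trait-by-environment combinations,
# ordered as (trait 1, env 1..3), (trait 2, env 1..3).
sigma = kronecker(trait_cov, env_cov)


def n_unstructured(k):
    """Number of free parameters in a k x k symmetric matrix."""
    return k * (k + 1) // 2
```

For the five traits and eight environments of the maize example, an unstructured 40 × 40 matrix would need `n_unstructured(40)` = 820 parameters, whereas the two component matrices need `n_unstructured(5) + n_unstructured(8)` = 51 before any scaling constraint, which is the kind of saving that makes the mixed model tractable.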