Pooling ANOVA Results From Multiply Imputed Datasets

Grund, Simon; Lüdtke, Oliver; Robitzsch, Alexander

doi:10.1027/1614-2241/a000111

Cited by 53 publications

(58 citation statements)

References 35 publications


“…Imputation has advantages over alternative methods to address missing data in diallel experiments in that it is relatively simple to implement, makes use of all available data for a given trait, and replaces missing data with plausible estimates to avoid reductions in sample size. However, there are several caveats and compromises regarding multiple imputation, namely, that there are inadequate or vague diagnostics and, although simple in principle, methods to pool multi-factor ANOVA results are often vague, or are not widely accessible (van Ginkel and Kroonenberg 2014; Grund et al. 2016). In this study, we demonstrate the application of existing, relatively straightforward, methods to pool results for diallel analysis across multiple environments.…”

Section: Discussion (mentioning)

confidence: 99%

Dissecting the Genetic Architecture of Shoot Growth in Carrot (Daucus carota L.) Using a Diallel Mating Design

Turner, Maurizio, Valdar, et al. 2018

G3 Genes|Genomes|Genetics


Crop establishment in carrot (Daucus carota L.) is limited by slow seedling growth and delayed canopy closure, resulting in high management costs for weed control. Varieties with improved growth habit (i.e., larger canopy and increased shoot biomass) may help mitigate weed control, but the underlying genetics of these traits in carrot is unknown. This project used a diallel mating design coupled with recent Bayesian analytical methods to determine the genetic basis of carrot shoot growth. Six diverse carrot inbred lines with variable shoot size were crossed in WI in 2014. F1 hybrids, reciprocal crosses, and parental selfs were grown in a randomized complete block design with two blocks in WI (2015) and CA (2015, 2016). Measurements included canopy height, canopy width, shoot biomass, and root biomass. General and specific combining abilities were estimated using Griffing’s Model I, which is a common analysis for plant breeding experiments. In parallel, additive, inbred, cross-specific, and maternal effects were estimated from a Bayesian mixed model, which is robust to dealing with data imbalance and outliers. Both additive and nonadditive effects significantly influenced shoot traits, with nonadditive effects playing a larger role early in the growing season, when weed control is most critical. Results suggest the presence of heritable variation and thus potential for improvement of these phenotypes in carrot. In addition, results present evidence of heterosis for root biomass, which is a major component of carrot yield.


“…Barnard and Rubin (1999) and Reiter (2007) developed improved error degrees of freedom for these combination rules. Simulation studies (Barnard & Rubin, 1999; Grund, Lüdtke, & Robitzsch, 2016; Liu & Enders, 2017; Reiter, 2007; Schafer, 1997) have shown that these combination rules generally give type-I error rates close to the theoretical type-I error rates.…”

Section: Introduction (mentioning)

confidence: 84%
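The combination rules and the adjusted degrees of freedom referred to in this statement can be illustrated for a single parameter. The following is a minimal sketch, assuming the standard form of Rubin's (1987) rules and the Barnard and Rubin (1999) small-sample degrees of freedom; the function name `pool_scalar` and its arguments are hypothetical and do not come from any of the cited papers or packages.

```python
from statistics import mean, variance


def pool_scalar(estimates, variances, df_com):
    """Pool one parameter across m imputed datasets (illustrative sketch).

    estimates : point estimates, one per imputed dataset
    variances : squared standard errors, one per imputed dataset
    df_com    : complete-data degrees of freedom (an input the user supplies)
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b      # total variance

    if b == 0:
        # No between-imputation variability: fall back to complete-data df.
        return q_bar, t, df_com

    lam = (1 + 1 / m) * b / t        # share of variance due to missing data
    df_old = (m - 1) / lam ** 2      # Rubin's (1987) large-sample df
    df_obs = (df_com + 1) / (df_com + 3) * df_com * (1 - lam)
    df_adj = 1 / (1 / df_old + 1 / df_obs)  # Barnard-Rubin adjusted df
    return q_bar, t, df_adj


q_bar, t, df_adj = pool_scalar([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], df_com=100)
print(q_bar, t, df_adj)
```

The adjusted degrees of freedom never exceed the complete-data value `df_com`, which is the property that motivates the correction in small samples.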

“…For testing several parameters for significance simultaneously, several solutions are available (Meng & Rubin, 1992; Rubin, 1987). Of these solutions, the most promising one (Rubin, 1987) according to several simulation studies (Grund et al., 2016; Liu & Enders, 2017; Reiter, 2007) is a set of formulas that are multivariate extensions of Equations (3)-(8).…”

Section: Multiparameter Estimates (mentioning)

confidence: 99%

Significance Tests and Estimates for R² for Multiple Regression in Multiply Imputed Datasets: A Cautionary Note on Earlier Findings, and Alternative Solutions

Ginkel 2019

Multivariate Behavioral Research


Whenever multiple regression is applied to a multiply imputed data set, several methods for combining significance tests for R² and the change in R² across imputed data sets may be used: the combination rules by Rubin, the Fisher z-test for R² by Harel, and F-tests for the change in R² by Chaurasia and Harel. For pooling R² itself, Harel proposed a method based on a Fisher z transformation. In the current article, it is argued that the pooled R² based on the Fisher z transformation, the Fisher z-test for R², and the F-test for the change in R² have some theoretical flaws. An argument is made for using Rubin's method for pooling significance tests for R² instead, and alternative procedures for pooling R² are proposed: simple averaging and a pooled R² constructed from the pooled significance test by Rubin. Simulations show that the Fisher z-test and Chaurasia and Harel's F-tests generally give inflated type-I error rates, whereas the type-I error rates of Rubin's method are correct. Of the methods for pooling the point estimates of R², no method clearly performs best, but it is argued that the average of the R² values across imputed data sets is preferred.
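Two of the point-estimate pooling approaches named in this abstract can be compared with a short sketch. It assumes that the Fisher z approach transforms the square root of each R² with the inverse hyperbolic tangent, averages on that scale, and back-transforms; the function names are hypothetical, and the third approach (a pooled R² built from Rubin's pooled significance test) is not shown.

```python
import math


def pool_r2_average(r2_values):
    """Pool R-squared by simple averaging across imputed datasets."""
    return sum(r2_values) / len(r2_values)


def pool_r2_fisher_z(r2_values):
    """Pool R-squared on the Fisher z scale (illustrative sketch).

    Each R-squared is converted to a correlation-like quantity sqrt(R2),
    transformed with atanh, averaged, and transformed back.
    """
    z_values = [math.atanh(math.sqrt(r2)) for r2 in r2_values]
    z_bar = sum(z_values) / len(z_values)
    return math.tanh(z_bar) ** 2


r2_per_dataset = [0.04, 0.64]  # made-up values, chosen to differ strongly
print(pool_r2_average(r2_per_dataset))
print(pool_r2_fisher_z(r2_per_dataset))
```

The two functions agree when R² is identical in every imputed dataset and diverge as the values spread out, which is why the choice of pooling method matters in practice.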


“…In addition to Rubin's rules (1987), the package implements the procedures commonly referred to as D1 (Reiter, 2007), D2 (Li, Meng, Raghunathan, & Rubin, 1991), and D3 (Meng & Rubin, 1992), which can be used for testing a variety of statistical hypotheses that potentially involve multiple parameters simultaneously (e.g., model comparisons … Li, Meng, et al. (1991) and Meng and Rubin (1992), or on variations thereof (Licht, 2010), but clear recommendations have not yet been made in the literature (see also Consentino & Claeskens, 2010; Grund, Lüdtke, & Robitzsch, 2016b).…”

Section: Discussion (mentioning)

confidence: 99%

Multiple Imputation of Multilevel Missing Data

2016

Self Cite


The treatment of missing data can be difficult in multilevel research because state-of-the-art procedures such as multiple imputation (MI) may require advanced statistical knowledge or a high degree of familiarity with certain statistical software. In the missing data literature, pan has been recommended for MI of multilevel data. In this article, we provide an introduction to MI of multilevel missing data using the R package pan, and we discuss its possibilities and limitations in accommodating typical questions in multilevel research. To make pan more accessible to applied researchers, we make use of the mitml package, which provides a user-friendly interface to the pan package and several tools for managing and analyzing multiply imputed data sets. We illustrate the use of pan and mitml with two empirical examples that represent common applications of multilevel models, and we discuss how these procedures may be used in conjunction with other software.

Keywords: multiple imputation, missing data, multilevel, R

… behind pan and MI, and we discuss which features of multilevel models must be considered when conducting MI. Finally, we use the mitml package to carry out MI for the empirical example. In that context, we will discuss possibilities for model diagnostics and tests of statistical hypotheses (e.g., model constraints, model comparisons).

Multilevel Modeling: An Empirical Example

Multilevel models account for dependencies in the data and allow relationships between variables to be estimated at different levels of analysis or effects that may vary across higher level observational units. For the purpose of this article, we assume that the multilevel structure consists of persons (e.g., students, employees) nested within groups (e.g., classes, work groups). If only the regression intercept varies across groups, the model is referred to as a random-intercept model. For example, Chen and Bliese (2002) examined the effects of individual characteristics (e.g., psychological strain) and leadership climate on the self-efficacy of U.S. soldiers. Kunter, Baumert, and Köller (2007) investigated the effects of student- and group-level ratings of classroom management on students' interest in mathematics. If the effects of additional predictor variables vary across groups, the model is referred to as a random-slope or random-coefficients model. For example, Hofmann, Morgeson, and Gerras (2003) investigated varying effects of leader-member exchange on safety behavior across work teams in the U.S. army.

The example data set used in this article is from the field of educational research and was taken from the German sample of primary school students who participated in the Progress in International Reading Literacy Study (PIRLS; Bos et al., 2005; Mullis, Martin, Gonzales, & Kennedy, 2003). The data set includes test scores in both mathematics (MA) and reading achievement (RA), a measure of cognitive ability (CA), a measure of socioeconomic status (SES), students' ratings of the quality of teaching in their math...
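The D2 procedure mentioned in the citation statement for this article pools the Wald-type (chi-square) test statistics obtained separately from each imputed dataset. The following is a minimal sketch, assuming the usual formulation attributed to Li, Meng, Raghunathan, and Rubin (1991); the function name `pool_d2` is hypothetical and is not the interface of the mitml package.

```python
import math
from statistics import mean, variance


def pool_d2(chisq_values, k):
    """Pool chi-square statistics with the D2 procedure (illustrative sketch).

    chisq_values : one chi-square statistic per imputed dataset
    k            : number of parameters tested (numerator df)
    Returns the pooled F-type statistic with its numerator and
    denominator degrees of freedom.
    """
    m = len(chisq_values)
    d_bar = mean(chisq_values)

    # Relative increase in variance, estimated from the spread of the
    # square roots of the per-dataset statistics.
    r = (1 + 1 / m) * variance([math.sqrt(d) for d in chisq_values])

    d2 = (d_bar / k - (m + 1) / (m - 1) * r) / (1 + r)
    if r == 0:
        df_denominator = math.inf
    else:
        df_denominator = k ** (-3 / m) * (m - 1) * (1 + 1 / r) ** 2
    return d2, k, df_denominator


# Made-up statistics from three imputed datasets, testing k = 2 parameters.
d2, df1, df2 = pool_d2([4.0, 9.0, 16.0], k=2)
print(d2, df1, df2)
```

The pooled statistic would be referred to an F distribution with `df1` and `df2` degrees of freedom; computing the p-value needs an F distribution function from a statistics library and is left out to keep the sketch dependency-free.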
