2008
DOI: 10.1111/j.1467-9531.2008.00202.x

9. Multiple Imputation of Incomplete Categorical Data Using Latent Class Analysis

Abstract: We propose using latent class analysis as an alternative to loglinear analysis for the multiple imputation of incomplete categorical data. Similar to log-linear models, latent class models can be used to describe complex association structures between the variables used in the imputation model. However, unlike loglinear models, latent class models can be used to build large imputation models containing more than a few categorical variables. To obtain imputations reflecting uncertainty about the unknown model p…
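
The imputation step described in the abstract can be illustrated with a minimal sketch, assuming an already fitted latent class model in which items are independent within a class. Everything here is hypothetical: the function name `impute_record`, the two-class model, and all probability values are made up for illustration and are not taken from the paper. The sketch holds the model parameters fixed, so it does not cover the part of the abstract (truncated above) about reflecting parameter uncertainty.

```python
import numpy as np

# Hypothetical fitted model: 2 latent classes, 3 binary items.
class_probs = np.array([0.6, 0.4])  # P(class c)
item_probs = [                      # item_probs[j][c, k] = P(item j = k | class c)
    np.array([[0.9, 0.1], [0.2, 0.8]]),
    np.array([[0.7, 0.3], [0.4, 0.6]]),
    np.array([[0.8, 0.2], [0.3, 0.7]]),
]


def impute_record(record, class_probs, item_probs, rng):
    """Fill the None entries of one record with a single random draw.

    record: list of category codes, with None marking a missing value.
    Returns the completed record and the posterior class probabilities.
    """
    # Posterior over classes given the observed items only:
    # P(c | y_obs) is proportional to P(c) * product of P(y_j | c) over observed j.
    posterior = np.array(class_probs, dtype=float)
    for j, value in enumerate(record):
        if value is not None:
            posterior = posterior * item_probs[j][:, value]
    posterior = posterior / posterior.sum()

    # Draw a class membership, then draw each missing item from that class.
    c = rng.choice(len(posterior), p=posterior)
    completed = []
    for j, value in enumerate(record):
        if value is None:
            n_categories = item_probs[j].shape[1]
            value = int(rng.choice(n_categories, p=item_probs[j][c]))
        completed.append(value)
    return completed, posterior


rng = np.random.default_rng(0)
completed, posterior = impute_record([0, None, 1], class_probs, item_probs, rng)
print(completed, posterior)
```

Calling `impute_record` several times on the same record gives several different completed records, which is the raw material for multiple imputation; a proper procedure would also vary the model parameters between imputations.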

Cited by 93 publications (130 citation statements)
References 43 publications (79 reference statements)

“…For categorical variables, the log-linear model (Schafer, 1997) would be the most appropriate imputation model. However, the log-linear model can be applied only when the number of variables used in the imputation model is small (Vermunt, van Ginkel, van der Ark, & Sijtsma, 2008), whereby it is feasible to set up and process the full multiway cross-tabulation required for the log-linear analysis. As an alternative, imputations can be carried out under a fully conditional specification approach (van Buuren, 2007), where a sequence of regression models for the univariate conditional distributions of the variables with missing values are specified.…”
Section: Multiple Imputation (MI), mentioning
confidence: 99%
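
The size argument in this statement can be made concrete with a short calculation, offered as a rough sketch. The helper names `crosstab_cells` and `lc_free_parameters` are hypothetical, and the latent class count assumes an unrestricted model with local independence: (C - 1) class proportions plus C sets of (K_j - 1) free conditional probabilities per variable.

```python
from math import prod


def crosstab_cells(levels):
    """Number of cells in the full multiway cross-tabulation."""
    return prod(levels)


def lc_free_parameters(levels, n_classes):
    """Free parameters of an unrestricted latent class model (assumed form)."""
    return (n_classes - 1) + n_classes * sum(k - 1 for k in levels)


# Variables with 3 categories each; 10 latent classes is an arbitrary choice.
for n_vars in (5, 10, 20):
    levels = [3] * n_vars
    print(n_vars, crosstab_cells(levels), lc_free_parameters(levels, 10))
```

The cross-tabulation grows as the product of the category counts, while the latent class parameter count grows only with their sum, which is why the full table quickly becomes infeasible to set up.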
“…In this article, a different alternative approach based on LCA was taken which classified students regarding their background characteristics and used these classifications as predictors in the LRM. This approach is similar to applying LCA directly as an imputation model for categorical data which has been shown to yield good recovery of parameters and standard errors when a sufficient number of latent classes is used (Vermunt et al, 2008). Since the parameters of interest in large-scale educational surveys are student proficiencies, we used LCA as a method of obtaining predictors for the LRM.…”
Section: Discussion, mentioning
confidence: 99%
“…Vermunt et al (2008) note that in the context of using LCA for the purposes of density estimation, fitting a larger number of classes than necessary (overfitting) is advantageous since sample specific variability can be captured whereas underfitting is problematic because important associations or interactions between variables are ignored. In this study, the LCA approaches were evaluated with respect to how well they recovered the results from the operational approach since this is the method currently used in most large-scale educational surveys and it has been shown to provide unbiased estimates (e.g., Oranje & Ye, 2014).…”
Section: Discussion, mentioning
confidence: 99%
“…In order to minimise any loss of information, multiple imputation is performed (Rubin 1987; Vermunt, Van Ginkel, Van der Ark and Sijtsma 2008). A highly important methodological challenge of this kind of complex model is the sparseness of data.…”
Section: 1, mentioning
confidence: 99%