When sparse data have to be fitted to a log-linear or latent class model, one cannot use the theoretical chi-square distribution to evaluate model fit, because with sparse data the observed cross-table has too many cells in relation to the number of observations to use a distribution that only holds asymptotically. The choice of a theoretical distribution is also difficult when model-expected frequencies are 0 or when model probabilities are estimated 0 or 1. The authors propose to solve these problems by estimating the distribution of a fit measure, using bootstrap methods. An algorithm is presented for estimating this distribution by drawing bootstrap samples from the model-expected proportions, the so-called nonnaive bootstrap method. For the first time the method is applied to empirical data of varying sparseness, from five different data sets. Results show that the asymptotic chi-square distribution is not at all valid for sparse data.
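The bootstrap procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it assumes a two-way table and uses the row-by-column independence model as a stand-in for a log-linear or latent class model, because that model has closed-form expected proportions. The function names (`fit_independence`, `pearson_x2`, `parametric_bootstrap_pvalue`) and the choice of the Pearson statistic as the fit measure are assumptions made for the example.

```python
import numpy as np

def fit_independence(table):
    """Model-expected proportions under row-by-column independence (stand-in model)."""
    n = table.sum()
    row = table.sum(axis=1, keepdims=True) / n
    col = table.sum(axis=0, keepdims=True) / n
    return row * col

def pearson_x2(table, expected_prop):
    """Pearson chi-square; cells with zero expected frequency are skipped."""
    expected = table.sum() * expected_prop
    mask = expected > 0
    return float((((table - expected) ** 2)[mask] / expected[mask]).sum())

def parametric_bootstrap_pvalue(table, n_boot=500, seed=0):
    """Estimate the distribution of the fit measure by resampling from the fitted model."""
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=float)
    n = int(table.sum())
    p_hat = fit_independence(table)           # model-expected proportions
    observed_stat = pearson_x2(table, p_hat)
    boot_stats = np.empty(n_boot)
    for b in range(n_boot):
        # Draw a sample of the same size from the model-expected proportions,
        # then refit the model to that sample before computing the fit measure.
        sample = rng.multinomial(n, p_hat.ravel()).reshape(table.shape).astype(float)
        boot_stats[b] = pearson_x2(sample, fit_independence(sample))
    p_value = float((boot_stats >= observed_stat).mean())
    return observed_stat, p_value, boot_stats

# A sparse 3x3 table with only 11 observations (made-up numbers for illustration).
sparse_table = [[3, 0, 1], [0, 2, 0], [1, 0, 4]]
stat, p_value, boot_stats = parametric_bootstrap_pvalue(sparse_table, n_boot=200, seed=1)
print(stat, p_value)
```

The bootstrap p-value is the share of resampled fit statistics at least as large as the observed one, so no reference to the asymptotic chi-square distribution is needed.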
The focus of this article is on Markov models for the analysis of panel data and, more specifically, on data obtained from repeated measurements of one categorical variable at several consecutive points in time. We first review developments in the field that attack the two main problems of the simple Markov model. The Mixed Markov model extends the simple model by allowing for population heterogeneity; the Latent Markov model incorporates measurement error and latent change into the simple model. Second, we present the more general Latent Mixed Markov model and show how both the Mixed Markov model and the Latent Markov model, as well as several more specific models, relate to this more general model. Finally, we reanalyze the Los Angeles panel data on depression with a focus on stability and change.
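One common way to write such a general model, given here as a sketch in assumed notation rather than the article's own, is as a mixture over $S$ unobserved chains, each with its own latent Markov process and response probabilities. The symbols $\pi_s$ (chain proportions), $\delta^{s}$ (initial latent distribution), $\tau^{s}$ (latent transition probabilities, taken as time-homogeneous here for brevity) and $\rho^{s}$ (response probabilities) are assumptions for illustration.

```latex
P(x_1, \dots, x_T)
  = \sum_{s=1}^{S} \pi_s
    \sum_{a_1} \cdots \sum_{a_T}
    \delta^{s}_{a_1}\, \rho^{s}_{x_1 \mid a_1}
    \prod_{t=2}^{T} \tau^{s}_{a_t \mid a_{t-1}}\, \rho^{s}_{x_t \mid a_t}
```

Under this notation, setting $S = 1$ gives a Latent Markov model, fixing every $\rho^{s}$ to the identity (no measurement error) gives a Mixed Markov model, and imposing both restrictions gives the simple Markov model.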
In classical test theory, the reliability of a test can be estimated by test-retest correlation models. These models do not apply to data at the lowest, or nominal, level of measurement. Instead, models for latent Markov chains may be used to correct for measurement error in panel data from three or more waves. This article shows how to use the E-M algorithm to estimate the parameters of a latent Markov chain. Where previous algorithms performed badly on variables with more than two categories, this algorithm performs better, although convergence is often slow. The method is applied to two trichotomous questions from the Dutch civil servants panel survey. Generally, the assumptions of the model, that is, a latent stationary Markov chain, are reasonably well met by the data. The probability of a correct answer, which can be interpreted as the reliability of a latent response category, is high in most cases (about .8). Transition tables corrected for measurement error according to the model are also presented. Standard errors of model parameters are approximated by a finite difference method.
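To make the model concrete, the sketch below computes the probability of one observed response sequence under a stationary latent Markov chain using the forward recursion, a standard building block for the E-step in hidden Markov estimation. It is not necessarily the computation used in the article. The function name `sequence_probability`, the parameter names `delta`, `tau` and `rho`, and all numeric values are assumptions for illustration; only the reliability of about .8 echoes the abstract.

```python
import itertools
import numpy as np

def sequence_probability(responses, delta, tau, rho):
    """Probability of an observed response sequence under a stationary latent Markov chain.

    delta: (K,)   initial distribution over latent categories
    tau:   (K, K) tau[a, b] = P(latent b at wave t+1 | latent a at wave t)
    rho:   (K, M) rho[a, x] = P(observed response x | latent category a)
    """
    # alpha[a] = P(responses so far, current latent category = a)
    alpha = delta * rho[:, responses[0]]
    for x in responses[1:]:
        alpha = (alpha @ tau) * rho[:, x]
    return float(alpha.sum())

# Hypothetical parameters for a trichotomous item observed at three waves.
delta = np.array([0.5, 0.3, 0.2])
tau = np.array([[0.8, 0.1, 0.1],
                [0.2, 0.7, 0.1],
                [0.1, 0.2, 0.7]])
# Probability of a correct answer (the diagonal) set to .8 for every latent category.
rho = np.array([[0.8, 0.1, 0.1],
                [0.1, 0.8, 0.1],
                [0.1, 0.1, 0.8]])

# The probabilities of all 27 possible three-wave response patterns add up to one.
total = sum(sequence_probability(seq, delta, tau, rho)
            for seq in itertools.product(range(3), repeat=3))
print(total)
```

With `rho` set to the identity matrix, the latent and observed categories coincide and the function returns the ordinary manifest Markov chain probability, which is one way to see the latent model as a measurement-error extension of the simple one.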