Classical model selection methods such as Akaike's information criterion (AIC) become problematic to apply when observations are missing. In this paper we propose variations on the AIC that are applicable to missing covariate problems. The method is based directly on the expectation-maximization (EM) algorithm and is readily available for EM-based estimation methods with little additional computational effort. The missing-data AIC criteria are formally derived and shown to work in a simulation study and in an application to data on diabetic retinopathy.
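To make the difficulty concrete, the following is a minimal sketch of the classical AIC, 2k - 2 log L, computed on complete cases for two candidate linear models. It is not the missing-data criterion proposed in the paper. The simulated data, the 30% missingness rate, and the helper name `gaussian_aic` are all assumptions made for illustration. When a covariate has missing values, the model that uses it is fitted on fewer rows, so the two AIC values refer to different data sets and are not directly comparable.

```python
import numpy as np

def gaussian_aic(X, y):
    """Classical AIC = 2k - 2 log L for a least-squares fit with Gaussian errors.

    Hypothetical helper for illustration only; returns (AIC, rows used).
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n                    # ML estimate of the error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    k = p + 1                                     # regression coefficients plus the variance
    return 2 * k - 2 * loglik, n

# Simulated data (assumption): two covariates, x2 partly unobserved
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)
x2_obs = x2.copy()
x2_obs[rng.random(n) < 0.3] = np.nan              # about 30% missing completely at random

ones = np.ones(n)

# Model A uses only x1, so every row is usable
aic_a, n_a = gaussian_aic(np.column_stack([ones, x1]), y)

# Model B also uses x2, so only complete cases can be used
keep = ~np.isnan(x2_obs)
X_b = np.column_stack([ones[keep], x1[keep], x2_obs[keep]])
aic_b, n_b = gaussian_aic(X_b, y[keep])

print(f"Model A: AIC = {aic_a:.1f} on {n_a} rows")
print(f"Model B: AIC = {aic_b:.1f} on {n_b} rows")
```

Because the log-likelihood grows in magnitude with the number of rows, the model fitted on fewer observations can appear better or worse for reasons unrelated to fit, which is the kind of problem a missing-data version of the AIC is meant to address.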
We develop nonparametric tests for the null hypothesis that a function has a prescribed form, for application to data sets with missing observations. Omnibus nonparametric tests do not require a particular parametric alternative to be specified and have power against a wide range of alternatives; the order selection tests that we study are one example. We extend such order selection tests to the context of missing data. In particular, we consider likelihood-based order selection tests for multiply imputed data. A simulation study and a data analysis illustrate the performance of the tests. A model selection method in the style of Akaike's information criterion for multiply imputed data sets follows along the same lines.
We derive explicit formulae for estimation in logistic regression models where some of the covariates are missing. Our approach allows the distribution of the missing covariates to be modelled as either a multivariate normal or a multivariate t-distribution. A main advantage of this method is that it is fast and does not require iterative procedures. A model selection method is derived that allows one to choose between these distributions. In addition, we consider versions of Akaike's information criterion, based on the expectation-maximization algorithm and on multiple imputation methods, that are widely applicable to model selection in likelihood models in general.