We consider a semiparametric method for estimating logistic regression models in which both the covariates and the outcome variable may be missing, and propose two new estimators. The first, which is based solely on the validation set, extends the validation likelihood estimator of Breslow and Cain (Biometrika 75:11-20, 1988). The second is a joint conditional likelihood estimator based on both the validation and nonvalidation data sets. Both estimators are semiparametric in that they require neither a model for the missing-data mechanism nor a specification of the conditional distribution of the missing covariates given the observed covariates. The asymptotic distribution theory is developed under the assumption that all covariates are categorical. Simulation studies of the finite-sample properties of the proposed estimators show that the joint conditional likelihood estimator is the most efficient. A cable TV survey data set from Taiwan is used to illustrate the practical use of the proposed methodology.
The randomized response technique (RRT) is an important tool commonly used to protect respondents' privacy and reduce biased answers in surveys on sensitive issues. In this work, we consider the joint use of the unrelated-question RRT of Greenberg et al. (J Am Stat Assoc 64:520–539, 1969) and the related-question RRT of Warner (J Am Stat Assoc 60:63–69, 1965) to handle the innocuous question in the unrelated-question RRT. Unlike the existing unrelated-question RRT of Greenberg et al. (1969), this approach obtains more information on the innocuous question by using the related-question RRT of Warner (1965), which improves the efficiency of the maximum likelihood estimator of Scheers and Dayton (J Am Stat Assoc 83:969–974, 1988). The prevalence of the sensitive characteristic can then be estimated by logistic regression. For this new design, we propose a transformation method and provide its large-sample properties. Motivated by two survey studies, an extramarital relationship study and a cable TV study, we also develop a joint conditional likelihood method. We conduct a simulation study of the relative efficiencies of the proposed methods and use the two survey studies to compare the analysis results under different scenarios.
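To make the two building blocks concrete, the following is a minimal Python sketch of the classical moment estimators for Warner's related-question design and for the unrelated-question design of Greenberg et al. with a known innocuous-question prevalence. It is not the joint design, transformation method, or likelihood estimator proposed in the abstract above. The function names and the symbols `p` (probability that the randomizing device selects the sensitive statement), `pi_u` (prevalence of the innocuous characteristic), and `yes_fraction` (observed proportion of "yes" answers) are assumptions introduced for illustration.

```python
import random


def warner_estimate(yes_fraction, p):
    """Warner (1965): P(yes) = p*pi + (1 - p)*(1 - pi), solved for pi."""
    if p == 0.5:
        raise ValueError("p = 0.5 makes the prevalence unidentifiable")
    return (yes_fraction - (1 - p)) / (2 * p - 1)


def greenberg_estimate(yes_fraction, p, pi_u):
    """Unrelated-question design with known pi_u:
    P(yes) = p*pi + (1 - p)*pi_u, solved for pi."""
    if p == 0:
        raise ValueError("p = 0 means the sensitive question is never asked")
    return (yes_fraction - (1 - p) * pi_u) / p


def simulate_warner(pi, p, n, rng):
    """Simulate n respondents under Warner's design; return the 'yes' fraction."""
    yes = 0
    for _ in range(n):
        has_trait = rng.random() < pi
        # With probability p the respondent is shown "I have the trait";
        # otherwise the complementary statement "I do not have the trait".
        shown_sensitive = rng.random() < p
        answer_yes = has_trait if shown_sensitive else not has_trait
        yes += answer_yes
    return yes / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    lam = simulate_warner(pi=0.3, p=0.7, n=200_000, rng=rng)
    print("Warner estimate of prevalence:", warner_estimate(lam, p=0.7))
```

Both estimators simply invert the linear relation between the observable "yes" probability and the sensitive prevalence, which is why Warner's design fails at `p = 0.5` and why the variance grows as `p` moves toward the uninformative value.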