An important quality aspect of censuses is the degree of coverage of the population. When administrative registers are available undercoverage can be estimated via capture-recapture methodology. The standard approach uses the log-linear model that relies on the assumption that being in the first register is independent of being in the second register. In models using covariates, this assumption of independence is relaxed into independence conditional on covariates. In this article we describe, in a general setting, how sensitivity analyses can be carried out to assess the robustness of the population size estimate. We make use of log-linear Poisson regression using an offset, to simulate departure from the model. This approach can be extended to the case where we have covariates observed in both registers, and to a model with covariates observed in only one register. The robustness of the population size estimate is a function of implied coverage: as implied coverage is low the robustness is low. We conclude that it is important for researchers to investigate and report the estimated robustness of their population size estimate for quality reasons. Extensions are made to log-linear modeling in case of more than two registers and the multiplier method.
Abstract. We are interested in an estimate of the usual residents in the Netherlands. Capture-recapture estimation with three registers enables us to estimate the size of the total population, of which the usual residents are a part. However, usual residence cannot be used as a covariate because it is not available in one of the registers. We approach this as a missing data problem. There are different methods available to handle missing data. In this manuscript we use Expectation Maximization (EM) algorithm and Predictive Mean Matching (PMM). The EM algorithm is often used in categorical data analysis, but PMM has the advantage of flexibility in the choice for a specific part of the observed data used for the imputation of the missing data. Four scenarios have been identified where the missing data are completed via either the EM algorithm or PMM imputation, resulting in different population size estimates for usual residence. It was found that the different scenarios lead to different population size estimates. Even small changes in the completed data lead to different population size estimates. In this study PMM imputation performs best according flexibility and it is theoretically better motivated.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.