In this paper we consider estimating heterogeneity variance with the DerSimonian-Laird (DSL) estimator as typically used in meta-analysis. In its general form the DSL estimator requires inverse population-averaged study-specific variances as weights, in which case the estimator is unbiased. It has become common practice, however, to use estimates of the study-specific variances instead of their population-averaged versions. This can lead to considerable bias. Simulations illustrate these findings.
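As a rough illustration of the estimator discussed above, the following sketch computes the conventional moment-based DSL estimate from study effects and study-specific variances. The function name `dsl_tau2` and its arguments are hypothetical and not taken from the paper. Whether the result is unbiased depends on which variances are passed in: the abstract's point is that plugging in estimated variances, rather than population-averaged ones, can bias the result.

```python
def dsl_tau2(effects, variances):
    """Moment-based DerSimonian-Laird estimate of heterogeneity variance.

    effects   : study-level effect estimates y_i
    variances : study-specific variances used to form weights w_i = 1 / v_i
                (assumption: the caller decides whether these are estimated
                or population-averaged variances)
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Inverse-variance weighted mean of the study effects.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the pooled mean.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    # Scaling constant of the moment estimator.
    scale = sum_w - sum(w * w for w in weights) / sum_w

    # Truncate at zero, as is conventional for a variance estimate.
    return max(0.0, (q - (k - 1)) / scale)


if __name__ == "__main__":
    # Three studies with unit variances: the sample variance of the effects
    # is 4, so the heterogeneity estimate is 4 - 1 = 3.
    print(dsl_tau2([0.0, 2.0, 4.0], [1.0, 1.0, 1.0]))
```

With equal variances the estimate reduces to the sample variance of the effects minus the common within-study variance, which gives a quick sanity check.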
Record linkage is an invaluable tool for many epidemiological studies. So far, however, in Europe it has mainly been used in Scandinavian countries with nationwide registration systems. Using the example of a large retrospective cohort study among migrants, we show that this method is also feasible in Germany, a country without central registries and unique identification numbers for citizens.
This note generalizes Chao's estimator of population size for closed capture-recapture studies to the case where covariates are available. Chao's estimator was developed under unobserved heterogeneity, in which case it represents a lower bound of the population size. If observed heterogeneity is available in the form of covariates, we show how this information can be used to reduce the bias of Chao's estimator. The key element in this development is the understanding and placement of Chao's estimator in a truncated Poisson likelihood. It is shown that a truncated Poisson likelihood (with log-link) with all counts truncated except ones and twos is equivalent to a binomial likelihood (with logit-link). This enables the development of a generalized Chao estimator as the estimated expected value of the frequency of zero counts under a truncated (all counts truncated except ones and twos) Poisson regression model. If the regression model accounts for the heterogeneity entirely, the generalized Chao estimator is asymptotically unbiased. A simulation study illustrates the potential gain in bias reduction. Comparisons of the generalized Chao estimator with the homogeneous zero-truncated Poisson regression approach are supplied as well. The method is applied to a surveillance study on the completeness of farm submissions in Great Britain.
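The link between Chao's estimator and the truncated Poisson likelihood can be checked numerically in the covariate-free case. The sketch below is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the regression fitting step is omitted, and `generalized_chao` simply takes per-unit Poisson rates as given. It uses the fact that, for a Poisson rate λ, the ratio P(0) / (P(1) + P(2)) equals 1 / (λ + λ²/2).

```python
def chao_lower_bound(n_observed, f1, f2):
    """Classical Chao estimate: n + f1^2 / (2 * f2)."""
    return n_observed + f1 ** 2 / (2.0 * f2)


def intercept_only_lambda(f1, f2):
    """Poisson rate implied by the binomial fit without covariates.

    Among units with count 1 or 2, the probability of a 2 is
    p = (lam / 2) / (1 + lam / 2), so lam = 2 * p / (1 - p).
    """
    p = f2 / (f1 + f2)
    return 2.0 * p / (1.0 - p)


def generalized_chao(n_observed, lambdas):
    """n plus the estimated expected number of zero counts.

    lambdas: one rate per unit with count 1 or 2 (assumed to come from a
    fitted truncated Poisson / logistic regression, not fitted here).
    """
    f0_hat = sum(1.0 / (lam + lam ** 2 / 2.0) for lam in lambdas)
    return n_observed + f0_hat


if __name__ == "__main__":
    n_observed, f1, f2 = 60, 40, 10  # made-up frequencies for illustration
    lam = intercept_only_lambda(f1, f2)
    print(chao_lower_bound(n_observed, f1, f2))
    print(generalized_chao(n_observed, [lam] * (f1 + f2)))
```

With a common rate for every unit, both functions return the same value, which is the sense in which the regression version generalizes the classical estimator.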
Proportion estimators are frequently used in many application areas. The conventional proportion estimator (number of events divided by sample size) encounters a number of problems when the data are sparse, as will be demonstrated in various settings. The problem of estimating its variance when sample sizes become small is rarely addressed in a satisfying framework. Specifically, we have in mind applications like the weighted risk difference in multicenter trials or stratified risk ratio estimators (to adjust for potential confounders) in epidemiological studies. It is suggested to estimate p using the parametric family p(c) and p(1 - p) using p(c)(1 - p(c)), where p(c) = (X + c)/(n + 2c). We investigate the problem of choosing c ≥ 0 from various perspectives, including minimizing the average mean squared error of p(c), and the average bias and average mean squared error of p(c)(1 - p(c)). The optimal value of c for minimizing the average mean squared error of p(c) is found to be independent of n and equals c = 1. The optimal value of c for minimizing the average mean squared error of p(c)(1 - p(c)) is found to depend on n, with limiting value c = 0.833. This might justify using a near-optimal value of c = 1 in practice, which also turns out to be beneficial when constructing confidence intervals of the form p(c) ± 1.96 sqrt(n p(c)(1 - p(c)))/(n + 2c).
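A small sketch of the shrinkage family p(c) = (X + c)/(n + 2c) may help make the sparse-data behaviour concrete. The function names are hypothetical, and the interval follows one reading of the formula quoted in the abstract, with the square root covering n p(c)(1 - p(c)); the limits are not clipped to [0, 1].

```python
import math


def p_c(x, n, c=1.0):
    """Proportion estimate (x + c) / (n + 2c); c = 0 gives the usual x / n."""
    return (x + c) / (n + 2.0 * c)


def variance_term(x, n, c=1.0):
    """Plug-in estimate of p(1 - p) based on p(c)."""
    p = p_c(x, n, c)
    return p * (1.0 - p)


def interval(x, n, c=1.0, z=1.96):
    """Interval p(c) +/- z * sqrt(n * p(c) * (1 - p(c))) / (n + 2c)."""
    p = p_c(x, n, c)
    half_width = z * math.sqrt(n * p * (1.0 - p)) / (n + 2.0 * c)
    return p - half_width, p + half_width


if __name__ == "__main__":
    # Sparse data: no events in 8 trials. The conventional estimate is 0
    # with estimated variance 0, while c = 1 gives a non-degenerate result.
    print(p_c(0, 8, c=0.0), variance_term(0, 8, c=0.0))
    print(p_c(0, 8), variance_term(0, 8))
    print(interval(0, 8))
```

The zero-event case shows the practical motivation: with c = 0 both the estimate and its variance collapse to zero, whereas c = 1 keeps both strictly positive.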
The purpose of the study is to estimate the population size under a homogeneous truncated count model and under model contamination via the Horvitz-Thompson approach, on the basis of a count capture-recapture experiment. The proposed estimator is based on a mixture of zero-truncated Poisson distributions. The proposed model accommodates long-tailed or skewed distributions, and its likelihood function is concave, so that strong results are available for the nonparametric maximum likelihood estimator (NPMLE). A simulation study compares McKendrick's, Mantel-Haenszel's, Zelterman's, Chao's, the maximum likelihood, and the proposed estimators. Under model contamination the proposed estimator is the best choice, with the smallest bias and smallest mean squared error for sufficiently large population sizes, and further results show that it performs well even in the homogeneous situation. Two empirical examples, the cholera epidemic in India (homogeneous) and the heroin user data in Bangkok 2002 (heterogeneous), are fitted with excellent goodness of fit, and the confidence interval estimates may also be of considerable interest.
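For orientation, the homogeneous special case of the Horvitz-Thompson approach can be sketched as follows. This is not the mixture (NPMLE) estimator proposed in the abstract: it assumes a single zero-truncated Poisson rate for every unit, and the function name, the fixed-point iteration, and the example frequencies are all illustrative choices. The maximum likelihood rate solves lam / (1 - exp(-lam)) = mean of the observed counts, and the population size estimate is n / (1 - exp(-lam)).

```python
import math


def zero_truncated_poisson_ht(counts, tol=1e-12, max_iter=1000):
    """Return (lam_hat, n_hat) for a homogeneous zero-truncated Poisson.

    counts: observed positive counts, one per captured unit.
    """
    if not counts or min(counts) < 1:
        raise ValueError("counts must be a non-empty list of positive integers")
    n = len(counts)
    mean = sum(counts) / n
    if mean <= 1.0:
        # All counts equal 1: the likelihood pushes lam to 0 and the
        # population size estimate is unbounded.
        raise ValueError("mean count must exceed 1 for a finite estimate")

    # Fixed-point iteration lam <- mean * (1 - exp(-lam)), started at the
    # observed mean (an upper bound for the solution).
    lam = mean
    for _ in range(max_iter):
        new_lam = mean * (1.0 - math.exp(-lam))
        if abs(new_lam - lam) < tol:
            lam = new_lam
            break
        lam = new_lam

    # Horvitz-Thompson: each observed unit is weighted by the inverse of
    # its probability of being seen at least once.
    p_seen = 1.0 - math.exp(-lam)
    return lam, n / p_seen


if __name__ == "__main__":
    # Made-up frequencies: 60 units seen once, 30 twice, 10 three times.
    counts = [1] * 60 + [2] * 30 + [3] * 10
    print(zero_truncated_poisson_ht(counts))
```

Under contamination or heterogeneity, a single rate of this kind tends to underestimate the population size, which is the motivation for replacing it with a mixture of zero-truncated Poisson components.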