When multilevel models are estimated from survey data derived using multistage sampling, unequal selection probabilities at any stage of sampling may induce bias in standard estimators, unless the sources of the unequal probabilities are fully controlled for in the covariates. This paper proposes alternative ways of weighting the estimation of a two-level model by using the reciprocals of the selection probabilities at each stage of sampling. Consistent estimators are obtained when both the sample number of level 2 units and the sample number of level 1 units within sampled level 2 units increase. Scaling of the weights is proposed to improve the properties of the estimators and to simplify computation. Variance estimators are also proposed. In a limited simulation study the scaled weighted estimators are found to perform well, although non-negligible bias starts to arise for informative designs when the sample number of level 1 units becomes small. The variance estimators perform extremely well. The procedures are illustrated using data from the survey of psychiatric morbidity.
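The weighting idea in this abstract can be sketched in a few lines. The abstract says only that reciprocals of the selection probabilities are used and that the weights are scaled; the two scaling rules below (level 1 weights summing to the cluster sample size, or to an effective sample size) are assumptions drawn from common practice in the weighted multilevel literature and are not necessarily the exact proposals of the paper. The function names are hypothetical.

```python
from collections import defaultdict


def inverse_probability_weights(selection_probs):
    """Weights as reciprocals of (conditional) selection probabilities."""
    return [1.0 / p for p in selection_probs]


def scale_level1_weights(cluster_ids, weights, method="size"):
    """Rescale level 1 weights within each level 2 unit (cluster).

    method="size":      scaled weights in a cluster sum to its sample size n_j.
    method="effective": scaled weights sum to (sum w)^2 / (sum w^2).
    Both rules are illustrative assumptions, not taken from the abstract.
    """
    sums = defaultdict(float)
    sq_sums = defaultdict(float)
    counts = defaultdict(int)
    for j, w in zip(cluster_ids, weights):
        sums[j] += w
        sq_sums[j] += w * w
        counts[j] += 1

    scaled = []
    for j, w in zip(cluster_ids, weights):
        if method == "size":
            factor = counts[j] / sums[j]
        elif method == "effective":
            factor = sums[j] / sq_sums[j]
        else:
            raise ValueError("method must be 'size' or 'effective'")
        scaled.append(w * factor)
    return scaled


# Hypothetical example: two clusters with unequal within-cluster probabilities.
clusters = ["a", "a", "b", "b", "b"]
raw = inverse_probability_weights([1.0, 1.0 / 3.0, 0.5, 0.5, 0.5])
by_size = scale_level1_weights(clusters, raw, method="size")
by_effective = scale_level1_weights(clusters, raw, method="effective")
```

Scaling leaves the relative weights within a cluster unchanged; it changes only their total, which is what matters for the small-cluster behaviour mentioned in the abstract.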
Sons. 1988. xv+301 pages. Price £Stg. 44.30 (hardcover).
ABSTRACT. ... shows an elliptical outer shell surrounding two aligned peaks that we interpret as limb-brightened peaks of an optically thin, elliptical shell with an equatorial density enhancement. This mid-IR morphology contrasts with that observed in the better-studied carbon-rich proto-planetary nebulae, AFGL 2688, AFGL 915, and AFGL 618, which show bright, unresolved cores, probably created by optically thick inner regions, and bipolar extensions that align with their optical reflection nebulosities. Using an axially symmetric dust code and assuming that the dust is composed of 0.01 μm amorphous carbon grains, we model the dust emission images and the spectral energy distributions of these four proto-planetary nebulae and of the young, carbon-rich planetary nebula IRAS 21282+5050, which also has an axially symmetric dust shell and other similarities with the proto-planetary nebulae that have the 21 μm dust feature. Marginally resolved mid-infrared images constrain the dust shell's inner radius, while well-resolved mid-infrared images additionally constrain other geometric parameters of the model (e.g., inclination angles and pole-to-equator mass-loss rate ratios). The modeling reveals that the observed axial symmetry in the dust shells of these objects coincides with an enhanced mass-loss phase (~3 × 10^-5 M_⊙ yr^-1) during which the equatorial mass-loss rate was a factor of 18-90 higher than the polar mass-loss rate, i.e., an axially symmetric superwind. Our dynamical age estimates indicate that these stars left the asymptotic giant branch approximately 300-1400 years ago, just after the superwind phase. For each object, the size and structure of the dust shell is the same for the sampled wavelengths, with the exception of IRAS 22272+5435, for which the 11.8 μm emission is larger than either the 8.2 or the 9.7 μm emission.
IRAS 22272+5435's spectrum has a larger dust feature-to-dust continuum ratio than found in the other objects, and hence its 11.8 μm image is probably dominated by the 11.8 μm feature emission that has different optical properties than the underlying continuum.
ABSTRACT. This article considers the assessment of the risk of identification of respondents in survey microdata, in the context of applications at the United Kingdom (UK) Office for National Statistics (ONS). The threat comes from the matching of categorical 'key' variables between microdata records and external data sources and from the use of log-linear models to facilitate matching. While the potential use of such statistical models is well-established in the literature, little consideration has been given to model specification or to the sensitivity of risk assessment to this specification. In numerical work not reported here, we have found that standard techniques for selecting log-linear models, such as chi-squared goodness of fit tests, provide little guidance regarding the accuracy of risk estimation for the very sparse tables generated by [...] the accuracy of risk estimates. We find that, within a class of 'reasonable' models, risk estimates tend to decrease as the complexity of the model increases. We develop criteria which detect 'underfitting' (associated with overestimation of the risk). The criteria may also reveal 'overfitting' (associated with underestimation) although not so clearly, so we suggest employing a forward model selection approach. Our criteria turn out to be related to established methods of testing for overdispersion in Poisson log-linear models. We show how our approach may be used for both file-level and record-level measures of risk. We evaluate the proposed procedures using samples drawn from the 2001 UK Census where the true risks can be determined and show that a forward selection approach leads to good risk estimates. There are several 'good' models between which our approach provides little discrimination. The risk estimates are found to be stable across these models, implying a form of robustness. We also apply our approach to a large survey dataset.
There is no indication that increasing the sample size necessarily leads to the selection of a more complex model. The risk estimates for this application display more variation but suggest a suitable upper bound.
Protection against disclosure is important for statistical agencies releasing microdata files from sample surveys. Simple measures of disclosure risk can provide useful evidence to support decisions about release. We propose a new measure of disclosure risk: the probability that a unique match between a microdata record and a population unit is correct. We argue that this measure has at least two advantages. First, we suggest that it may be a more realistic measure of risk than two measures that are currently used with census data. Second, we show that consistent inference (in a specified sense) may be made about this measure from sample data without strong modelling assumptions. This is a surprising finding, in its contrast with the properties of the two 'similar' established measures. As a result, this measure has potentially useful applications to sample surveys. In addition to obtaining a simple consistent predictor of the measure, we propose a simple variance estimator and show that it is consistent. We also consider the extension of inference to allow for certain complex sampling schemes. We present a numerical study based on 1991 census data for about 450 000 enumerated individuals in one area of Great Britain. We show that the theoretical results on the properties of the point predictor of the measure of risk and its variance estimator hold to a good approximation for these data. Copyright 2002 Royal Statistical Society.
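To make the proposed measure concrete, here is a minimal sketch of one way to formalize "the probability that a unique match is correct". The formalization is an assumption, not a statement from the abstract: write F_k for the population count and f_k for the sample count in cell k of the cross-classified key variables, and take the measure to be (number of sample-unique cells) / (sum of F_k over those cells). The sample-based estimator assumes Bernoulli sampling with a known sampling fraction; under that assumption the expected value of n1 + 2(1 - pi) n2 / pi equals the expected value of the denominator above, where n1 and n2 count cells with sample frequency 1 and 2. Function names are hypothetical.

```python
from collections import Counter


def unique_match_risk(population_keys, sample_keys):
    """Population-level measure, computable only when population counts are known.

    Assumed form: (# sample-unique cells) / (sum of F_k over sample-unique cells).
    """
    F = Counter(population_keys)
    f = Counter(sample_keys)
    unique_cells = [k for k, count in f.items() if count == 1]
    if not unique_cells:
        return None  # no unique matches, so the measure is undefined
    return len(unique_cells) / sum(F[k] for k in unique_cells)


def estimate_unique_match_risk(sample_keys, sampling_fraction):
    """Sample-based estimate, assuming Bernoulli sampling at a known rate."""
    f = Counter(sample_keys)
    n1 = sum(1 for count in f.values() if count == 1)  # sample uniques
    n2 = sum(1 for count in f.values() if count == 2)  # sample pairs
    pi = sampling_fraction
    denominator = pi * n1 + 2.0 * (1.0 - pi) * n2
    if denominator == 0:
        return None
    return pi * n1 / denominator


# Hypothetical toy data: each string is one unit's combination of key values.
population = ["x", "x", "y", "z", "z", "z"]
sample = ["x", "y", "z", "z"]
true_risk = unique_match_risk(population, sample)
estimated_risk = estimate_unique_match_risk(sample, sampling_fraction=0.5)
```

The estimate uses only the sample frequencies and the sampling fraction, which illustrates why inference about this kind of measure does not need a model for the population counts. On a toy dataset this small the estimate and the true value need not agree closely.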
Summary. Non‐response is a common source of error in many surveys. Because surveys often are costly instruments, quality‐cost trade‐offs play a continuing role in the design and analysis of surveys. Advances in telephone, computer, and Internet technology have had, and still have, considerable impact on the design of surveys. Recently, a strong focus on methods for survey data collection monitoring and tailoring has emerged as a new paradigm to efficiently reduce non‐response error. Paradata and adaptive survey designs are key words in these new developments. Prerequisites to evaluating, comparing, monitoring, and improving quality of survey response are a conceptual framework for representative survey response, indicators to measure deviations thereof, and indicators to identify subpopulations that need increased effort. In this paper, we present an overview of representativeness indicators or R‐indicators that are fit for these purposes. We give several examples and provide guidelines for their use in practice.
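As an illustration of what such an indicator can look like, the sketch below computes R = 1 - 2 S(rho), where S(rho) is the (design-weighted) standard deviation of estimated response propensities. This form is the commonly cited definition of the R-indicator and is stated here as an assumption, since the summary does not give a formula. The propensities are taken as given; in practice they would be estimated, for example by a response model on auxiliary variables. The function name is hypothetical.

```python
import math


def r_indicator(propensities, design_weights=None):
    """R = 1 - 2 * S(rho), with S the weighted standard deviation of propensities.

    Assumed definition; R = 1 means all units have equal response propensity.
    """
    n = len(propensities)
    if design_weights is None:
        design_weights = [1.0] * n
    total = sum(design_weights)
    if total <= 1:
        raise ValueError("need a weighted total greater than 1")
    mean = sum(w * p for w, p in zip(design_weights, propensities)) / total
    variance = sum(
        w * (p - mean) ** 2 for w, p in zip(design_weights, propensities)
    ) / (total - 1)
    return 1.0 - 2.0 * math.sqrt(variance)


# Hypothetical propensities: a perfectly balanced and an unbalanced response.
balanced = r_indicator([0.6, 0.6, 0.6, 0.6])
unbalanced = r_indicator([0.2, 0.8, 0.2, 0.8])
```

Lower values indicate response propensities that vary more across units, which flags a less representative response.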