Outliers are a well-known problem in survey estimation, and a variety of approaches have been suggested for dealing with them in this context. However, when the focus is on small area estimation using the survey data, much less is known, even though outliers within a small area sample are clearly much more influential than they are in the larger overall sample. To the best of our knowledge, Chambers and Tzavidis (2006) was the first published paper in small area estimation that explicitly addressed the issue of outlier robustness, using an approach based on fitting outlier robust M-quantile models to the survey data. Recently, Sinha and Rao (2009) have also addressed this issue from the perspective of linear mixed models. Both these approaches, however, use plug-in robust prediction. That is, they replace parameter estimates in optimal, but outlier sensitive, predictors by outlier robust versions. Unfortunately, this approach may involve an unacceptable prediction bias (but a low prediction variance) in situations where the outliers are drawn from a distribution that has a different mean from the rest of the survey data (Chambers, 1986), which then leads to the suggestion that outlier robust prediction should include an additional term that corrects for this bias. In this paper, we explore the extension of this idea to the small area estimation situation and we propose two different analytical mean squared error (MSE) estimators for outlier robust predictors of small area means. We use simulation based on realistic outlier contaminated data to evaluate how the extended small area estimation approach compares with the plug-in robust methods described earlier. The empirical results show that the bias-corrected predictive estimators are less biased than the projective estimators, especially when there are outliers in the area effects.
Moreover, in the simulation experiments we contrast the proposed MSE estimators with those generally utilized for the plug-in robust predictors. The proposed bias-robust and linearization-based MSE estimators appear to perform well when used with the robust predictors of small area means that are considered in this paper.
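The difference between plug-in robust prediction and bias-corrected prediction can be illustrated with a minimal Python sketch. It is not the estimator of the paper: it assumes the robust fitted values and a robust residual scale have already been obtained, and it uses one common form of correction in the spirit of Chambers (1986), namely the average of Huber-truncated sample residuals scaled by the non-sampled fraction. The function names `huber_psi` and `predict_area_mean` and the tuning constant `c_bc` are hypothetical.

```python
import numpy as np


def huber_psi(u, c):
    """Huber influence function: identity on [-c, c], constant outside."""
    return np.clip(u, -c, c)


def predict_area_mean(y_s, fitted_s, fitted_r, scale, c_bc=3.0):
    """Return (plug_in, bias_corrected) predictors of one area mean.

    y_s      : observed values for the sampled units in the area
    fitted_s : robust fitted values for those sampled units
    fitted_r : robust fitted values for the non-sampled units
    scale    : robust estimate of the residual scale (assumed given)
    c_bc     : tuning constant of the correction term (assumed value)
    """
    y_s = np.asarray(y_s, dtype=float)
    fitted_s = np.asarray(fitted_s, dtype=float)
    fitted_r = np.asarray(fitted_r, dtype=float)
    n = y_s.size
    N = n + fitted_r.size

    # Plug-in predictor: observed values plus robust predictions.
    plug_in = (y_s.sum() + fitted_r.sum()) / N

    # Correction: mean truncated residual, weighted by the share of
    # the area that was not sampled.
    std_resid = (y_s - fitted_s) / scale
    correction = (1.0 - n / N) * scale * huber_psi(std_resid, c_bc).mean()
    return plug_in, plug_in + correction


plug_in, corrected = predict_area_mean(
    y_s=[10.0, 11.0, 12.0, 30.0],
    fitted_s=[11.0, 11.0, 11.0, 11.0],
    fitted_r=[11.0] * 6,
    scale=1.0,
)
```

A large `c_bc` lets more of the outlying residual through, which reduces bias at the cost of variance; a small `c_bc` moves the result back toward the plug-in predictor.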
Small area estimation, Spatial correlation, SAR model, Spatial EBLUP, Lattice data
Summary. Multilevel modelling is a popular approach for longitudinal data analysis. Statistical models conventionally target a parameter at the centre of a distribution. However, when the distribution of the data is asymmetric, modelling other location parameters, e.g. percentiles, may be more informative. We present a new approach, M‐quantile random‐effects regression, for modelling multilevel data. The proposed method is used for modelling location parameters of the distribution of the strengths and difficulties questionnaire scores of children in England who participate in the Millennium Cohort Study. Quantile mixed models are also considered. The analyses offer insights to child psychologists about the differential effects of risk factors on children's outcomes.
The National Sample Survey Organisation (NSSO) surveys are the main source of official statistics in India and generate a range of invaluable data at the macro level (e.g. state and national level). However, the NSSO data cannot be used directly to produce reliable estimates at the micro level (e.g. district or more disaggregated level) due to small sample sizes. There is a rapidly growing demand for such micro level statistics in India as the country is moving from a centralized to a more decentralized planning system. In this article we employ small area estimation (SAE) techniques to derive model-based estimates of the proportion of indebted households at district or at other small area levels in the State of not possible to produce estimates using sample data alone. The model-based estimates generated using SAE are still reliable for such areas. The estimates are expected to provide invaluable information to policy analysts and decision-makers.
Viraemia persistently ≤20 000 IU/mL predicts a benign clinical outcome: it was associated with transition to IC in 43% of LV-AC and to Occult HBV Infection in 20% of IC within 5 years. Nevertheless, 13.1% of individuals with low viraemia at presentation develop CHB within 1 year: 1-year HBV-DNA monitoring proved the most accurate diagnostic approach, and it can be limited to at least half of cases by single-point HBV-DNA/HBsAg quantification. The IC diagnostic accuracy of combining HBV-DNA/total-anti-HBc/HBcrAg needs to be confirmed in further studies.
We introduce a semi-parametric approach to ecological regression for disease mapping, based on modelling the regression M-quantiles of a Negative Binomial variable. The proposed method is robust to outliers in the model covariates, including those due to measurement error, and can account for both spatial heterogeneity and spatial clustering. A simulation experiment based on the well-known Scottish lip cancer data set is used to compare the M-quantile modelling approach and a random effects modelling approach for disease mapping. This suggests that the M-quantile approach leads to predicted relative risks with smaller root mean square error than standard disease mapping methods. The paper concludes with an illustrative application of the M-quantile approach, mapping low birth weight incidence data for English Local Authority Districts for the years 2005-2010.
The timely, accurate monitoring of social indicators, such as poverty or inequality, on a fine-grained spatial and temporal scale is a crucial tool for understanding social phenomena and policymaking, but poses a great challenge to official statistics. This article argues that an interdisciplinary approach, combining the body of statistical research in small area estimation with the body of research in social data mining based on Big Data, can provide novel means to tackle this problem successfully. Big Data derived from the digital crumbs that humans leave behind in their daily activities are in fact providing ever more accurate proxies of social life. Social data mining from these data, coupled with advanced model-based techniques for fine-grained estimates, has the potential to provide a novel microscope through which to view and understand social complexity. This article suggests three ways to use Big Data together with small area estimation techniques, and shows how Big Data has the potential to mirror aspects of well-being and other socioeconomic phenomena.
In this paper we define a finite mixture of quantile and M-quantile regression models for heterogeneous and/or for dependent/clustered data. Components of the finite mixture represent clusters of individuals with homogeneous values of model parameters. Owing to their flexibility and ease of estimation, the proposed approaches can be extended to random coefficients with a higher dimension than the simple random intercept case. Estimation of model parameters is obtained through maximum likelihood, by implementing an EM-type algorithm. The standard error estimates for model parameters are obtained using the inverse of the observed information matrix, derived through the Oakes (J R Stat Soc Ser B 61:479–482, 1999) formula in the M-quantile setting, and through nonparametric bootstrap in the quantile case. We present a large scale simulation study to analyse the practical behaviour of the proposed model and to evaluate the empirical performance of the proposed standard error estimates for model parameters. We considered a variety of empirical settings in both the random intercept and the random coefficient case. The proposed modelling approaches are also applied to two well-known datasets which give further insights on their empirical behaviour.
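For readers unfamiliar with the building block named above, the following is a minimal Python sketch of a single linear M-quantile regression fitted by iteratively reweighted least squares. It does not implement the finite mixture or the EM-type algorithm of the paper. It assumes a Huber influence function with tuning constant `c`, a MAD-based residual scale re-estimated at each iteration, and a design matrix that already contains an intercept column; the function name `mquantile_fit` and the simulated data are hypothetical.

```python
import numpy as np


def mquantile_fit(X, y, q=0.5, c=1.345, max_iter=200, tol=1e-8):
    """Fit a linear M-quantile regression of order q by IRLS.

    X is an (n, p) design matrix (include a column of ones for the
    intercept), y is the response, c is the Huber tuning constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least squares start
    for _ in range(max_iter):
        r = y - X @ beta
        # Robust scale: median absolute deviation, rescaled for normality.
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        if s <= 0:
            s = 1.0  # degenerate case, e.g. a perfect fit
        u = np.abs(r) / s
        # Huber weight psi(u) / u = min(1, c / |u|), guarded against u = 0.
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        # Asymmetry: positive residuals get 2q, the others 2(1 - q).
        w = w * 2.0 * np.where(r > 0, q, 1.0 - q)
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta


# Simulated example: y = 1 + 2x + standard normal noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500)
X = np.column_stack([np.ones_like(x), x])
fits = {q: mquantile_fit(X, y, q=q) for q in (0.1, 0.5, 0.9)}
```

With `q = 0.5` this reduces to ordinary Huber M-regression; letting `c` grow large moves the fit toward expectile regression, while a very small `c` approaches quantile regression.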