Abstract: The Pareto distribution is used in many areas of economics to model the right tail of heavy-tailed distributions. However, the standard method of estimating the shape parameter (the Pareto tail index) of this distribution, the maximum likelihood estimator (MLE), also known as the Hill estimator, is non-robust, in the sense that it is very sensitive to extreme observations, data contamination or model deviation. In recent years, a number of robust estimators for the Pareto tail index have been proposed, whi…
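To make the non-robustness claim concrete, here is a minimal sketch of the Hill estimator in its standard textbook form (the abstract does not spell out the formula, so the function name `hill_estimator` and the parameter `k`, the number of upper order statistics used, are illustrative assumptions). It returns an estimate of ξ = 1/α, the reciprocal of the Pareto tail index α.

```python
import math


def hill_estimator(data, k):
    """Hill estimate of xi = 1/alpha from the k largest observations.

    Illustrative sketch: averages the log-excesses of the k largest
    values over the (k+1)-th largest value, which acts as the threshold.
    """
    x = sorted(data, reverse=True)  # x[0] is the largest observation
    if not 1 <= k < len(x):
        raise ValueError("k must satisfy 1 <= k < len(data)")
    threshold = x[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    return sum(math.log(x[i] / threshold) for i in range(k)) / k


# Toy data whose log-excesses over the threshold 1.0 are 1 and 2.
clean = [1.0, math.e, math.e ** 2]
# The same sample with the maximum replaced by a gross outlier.
contaminated = [1.0, math.e, math.e ** 10]

clean_estimate = hill_estimator(clean, k=2)                # (1 + 2) / 2
contaminated_estimate = hill_estimator(contaminated, k=2)  # (1 + 10) / 2
```

In this toy sample a single contaminated maximum moves the estimate from 1.5 to 5.5, which is the kind of sensitivity to extreme observations the abstract refers to.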
“…In fact, the resulting estimator $\xi_{k_0,k}(n)$, where $k_0$ is automatically selected, has an excellent finite sample performance and it is adaptively robust. This novel adaptive robustness property is not present in other robust estimators of [20,23,28,13,31,14], which involve hard to select tuning parameters. Also none of these estimators is able to identify outliers in the extremes, a property inherent to the adaptive trimmed Hill estimator.…”
We introduce a trimmed version of the Hill estimator for the index of a heavy-tailed distribution, which is robust to perturbations in the extreme order statistics. In the ideal Pareto setting, the estimator is essentially finite-sample efficient among all unbiased estimators with a given strict upper breakdown point. For general heavy-tailed models, we establish the asymptotic normality of the estimator under second-order regular variation conditions and also show it is minimax rate-optimal in the Hall class of distributions. We also develop an automatic, data-driven method for the choice of the trimming parameter which yields a new type of robust estimator that can adapt to the unknown level of contamination in the extremes. This adaptive robustness property makes our estimator particularly appealing and superior to other robust estimators in the setting where the extremes of the data are contaminated. As an important application of the data-driven selection of the trimming parameters, we obtain a methodology for the principled identification of extreme outliers in heavy-tailed data. Indeed, the method has been shown to correctly identify the number of outliers in the previously explored Condroz data set.
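The trimming idea can be sketched as follows. The abstract does not give the estimator's formula, so the weighting below is an assumption: it is one unbiased weighting in the exact Pareto case that reduces to the classical Hill estimator when no observations are trimmed. The names `trimmed_hill`, `k0` (number of top order statistics discarded) and `k` are illustrative, and the data-driven selection of `k0` described in the abstract is not implemented here.

```python
import math


def trimmed_hill(data, k0, k):
    """Trimmed Hill-type estimate of xi = 1/alpha (illustrative sketch).

    Discards the k0 largest observations and uses the order statistics
    ranked k0+1 .. k+1 from the top. Assumed weighting: the largest
    retained log-excess gets weight k0 + 1, the others weight 1, and the
    total is divided by k - k0. With k0 = 0 this is the Hill estimator.
    """
    x = sorted(data, reverse=True)  # x[0] is the largest observation
    if not 0 <= k0 < k < len(x):
        raise ValueError("need 0 <= k0 < k < len(data)")
    base = x[k]  # the (k+1)-th largest observation, used as threshold
    if base <= 0:
        raise ValueError("the threshold observation must be positive")
    # Largest retained observation, up-weighted to offset the trimming.
    lead = (k0 + 1) * math.log(x[k0] / base)
    # Remaining retained observations strictly above the threshold.
    rest = sum(math.log(x[i] / base) for i in range(k0 + 1, k))
    return (lead + rest) / (k - k0)


# Toy data with log-excesses 1, 2, 3 over the threshold 1.0.
clean = [1.0, math.e, math.e ** 2, math.e ** 3]
# Same sample with the maximum replaced by a gross outlier.
contaminated = [1.0, math.e, math.e ** 2, math.e ** 30]

untrimmed_clean = trimmed_hill(clean, k0=0, k=3)
untrimmed_contaminated = trimmed_hill(contaminated, k0=0, k=3)
trimmed_clean = trimmed_hill(clean, k0=1, k=3)
trimmed_contaminated = trimmed_hill(contaminated, k0=1, k=3)
```

With `k0=1` the estimate is the same for the clean and the contaminated sample, because the perturbed maximum is never used; with `k0=0` the outlier changes the estimate substantially.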
“…Moreover, to overcome the problem of outliers and influential observations, we recalibrate sample weights following the approach proposed by Alfons et al. (2013) and generally adopted by those working with income variables (Alfons and Templ, 2013; Brzezinski, 2016; Jenkins, 2017; Safari et al., 2018, 2019; Templ et al., 2019). This procedure consists of detecting outlier observations against a fitted Pareto distribution of the variable of interest, applying Van Kerm’s rule of thumb to determine the threshold (Van Kerm, 2007).…”
Government interventions in the agricultural sector have been historically justified by the existence of an income disparity between farmers and non-farmers. However, recent studies have found that such disparity is disappearing over time, particularly in the United States. This work offers the first longitudinal systematic assessment on the average income disparity between farm and non-farm units in the European Union, differentiating between old and new Member States. Using the EU-SILC dataset, both broad (having some farm income) and narrow (living mainly on agriculture) farm households are compared with a general sample of non-farm households and a more restricted sample of self-employed non-farm households. To control for household observable characteristics and time-constant unobserved factors, we use a fixed effects regression. Results suggest that the farm/non-farm income disparity has disappeared in the European Union unless we compare narrow farm households with all non-farm households: in this case, the former are more likely to be better off than the latter. A limited income disparity is found only in the case of new Member States for broad farm households only. Results are used to draw policy implications regarding the role of CAP in supporting farm income.
“…However, those data have been treated in order to circumvent non-robustness problems. The issue of robust estimation of economic indicators based on a semi-parametric Pareto upper tail model is well established in the literature; see Brzezinski (2016) for a review and Alfons et al. (2013) for a specification suitable for survey data. On the contrary, the issue of robust treatment of outliers in the lower tail of the income distribution appears less established; see Van Kerm (2007) and Masseran et al. (2019).…”
Section: Design-based Simulation On Bias Correction (mentioning)
confidence: 99%
“…On the contrary, the issue of robust treatment of outliers in the lower tail of the income distribution appears less established; see Van Kerm (2007) and Masseran et al. (2019). As regards the upper tail, we operated a semiparametric Pareto-tail modeling procedure using the Probability Integral Transform Statistic Estimator (PITSE) proposed by Finkelstein et al. (2006), which blends very good performance in small samples and a fast computational implementation, as suggested by Brzezinski (2016). As regards the lower-tail extreme value treatment, we used an inverse Pareto modification of the PITSE estimator, suggested by Masseran et al. (2019).…”
Section: Design-based Simulation On Bias Correction (mentioning)
Income inequality measures are biased in small samples, generally leading to underestimation. After investigating the nature of the bias, we propose a bias-correction framework for a large class of inequality measures comprising the Gini index and the Generalized Entropy and Atkinson families, accounting for complex survey designs. The proposed methodology is based on Taylor expansions and the generalized linearization method, and does not require any parametric assumption on the income distribution, which makes it very flexible. Design-based performance evaluation of the suggested correction has been carried out using data taken from the EU-SILC survey. Results show a noticeable bias reduction for all measures. A bootstrap variance estimation proposal and a distributional analysis follow in order to provide a comprehensive overview of the behavior of inequality estimators in small samples. Results on the estimators' distributions show increasing positive skewness and leptokurtosis at decreasing sample sizes, confirming the non-applicability of the classical asymptotic results in small samples and suggesting the development of alternative methods of inference.