1992
DOI: 10.1111/j.2517-6161.1992.tb01449.x

Identifying Multiple Outliers in Multivariate Data

Abstract: We propose a procedure for the detection of multiple outliers in multivariate data. Let X be an n × p data matrix representing n observations on p variates. We first order the n observations, using an appropriately chosen robust measure of outlyingness, then divide the data set into two initial subsets: a ‘basic’ subset which contains p + 1 ‘good’ observations and a ‘non-basic’ subset which contains the remaining n − p − 1 observations. Second, we compute the relative distance from each point in the dat…
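The two-stage procedure summarized in the abstract can be sketched as follows. This is a hypothetical simplification, not the published algorithm: the robust ordering here is Euclidean distance from the coordinatewise median, the subset is grown to roughly half the data, and a chi-square consistency factor stands in for the paper's own correction factors.

```python
import numpy as np
from scipy.stats import chi2

def hadi_style_outliers(X, alpha=0.05):
    """Hypothetical sketch of a forward outlier search on an n x p matrix."""
    n, p = X.shape
    # Stage 1: order observations by a crude robust outlyingness measure
    # (distance from the coordinatewise median), then seed the 'basic'
    # subset with the p + 1 most central points.
    med = np.median(X, axis=0)
    order = np.argsort(np.linalg.norm(X - med, axis=1))
    basic = list(order[:p + 1])
    # Stage 2: grow the basic subset one point at a time, always adding
    # the non-basic point with the smallest Mahalanobis distance from the
    # current basic-subset fit, until the subset holds about half the data.
    h = (n + p + 1) // 2
    while len(basic) < h:
        S = X[basic]
        diff = X - S.mean(axis=0)
        inv = np.linalg.pinv(np.cov(S, rowvar=False))  # pinv guards singular cov
        d2 = np.einsum('ij,jk,ik->i', diff, inv, diff)  # squared Mahalanobis
        outside = [i for i in range(n) if i not in basic]
        basic.append(min(outside, key=lambda i: d2[i]))
    # The covariance of an inner half-sample is shrunk, so rescale by a
    # chi-square consistency factor (a stand-in for the paper's corrections),
    # then flag points beyond the chi-square cut-off.
    S = X[basic]
    c = (h / n) / chi2.cdf(chi2.ppf(h / n, p), p + 2)
    cov = c * np.cov(S, rowvar=False)
    diff = X - S.mean(axis=0)
    d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.pinv(cov), diff)
    return d2, np.flatnonzero(d2 > chi2.ppf(1 - alpha, df=p))
```

Starting from the most central p + 1 points rather than the full sample is what keeps a cluster of outliers from masking one another; they can only enter the basic subset last, when their inflated distances give them away.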

Cited by 452 publications (390 citation statements)
References 7 publications
“…This indicates that the linking of the individual and firm data is incomplete. Second, we remove some potentially influential outliers that we detected by using the method proposed by Hadi (1992, 1994). The method is useful for finding multiple outliers in multivariate data.…”
Section: Data and Variables
confidence: 99%
“…32 Finally, we consider the effect of eliminating the outliers on the results of the estimation. Following Roodman (2007) and Easterly et al (2004), outliers are chosen by applying the Hadi (1992) procedure, using 0.05 as the cut-off significance level. Table 12 presents the results of the different estimation procedures in the cross-section sample once the outliers are eliminated.…”
Section: Panel Data Estimation With Lagged Dependent Variable
confidence: 99%
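The 0.05 cut-off mentioned in the excerpt above corresponds to an upper chi-square quantile for squared Mahalanobis distances. A minimal illustration (the cited procedure's exact calibration differs; the data and dimension here are made up):

```python
import numpy as np
from scipy.stats import chi2

p = 3                                  # number of variates (example value)
cutoff = chi2.ppf(0.95, df=p)          # 0.05 significance cut-off

rng = np.random.default_rng(1)
X = rng.normal(size=(200, p))          # clean multivariate data
diff = X - X.mean(axis=0)
inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum('ij,jk,ik->i', diff, inv, diff)  # squared Mahalanobis
flagged = np.flatnonzero(d2 > cutoff)  # roughly 5% of clean points exceed it
```

With classical (non-robust) mean and covariance estimates, as here, a single gross outlier can inflate the covariance enough to hide itself; that masking effect is what motivates the forward procedure in the paper.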
“…Outliers will only be included as m approaches n, when no good observations remain to be introduced into the fit. Hadi (1992) uses the same forward algorithm starting from robust estimates of the means and covariances for calculation of the initial Mahalanobis distances. His forward search terminates when m is the median of the number of observations when allowance is made for the effect of fitting.…”
Section: The Forward Identification Of Outliers Using Mahalanobis Distances
confidence: 99%