2020
DOI: 10.1016/j.chemolab.2020.103957

Real-time outlier detection for large datasets by RT-DetMCD

Abstract: Modern industrial machines can generate gigabytes of data in seconds, frequently pushing the boundaries of available computing power. Together with the time criticality of industrial processing, this presents a challenging problem for any data analytics procedure. We focus on the deterministic minimum covariance determinant method (DetMCD), which detects outliers by fitting a robust covariance matrix. We construct a much faster version of DetMCD by replacing its initial estimators by two new methods and incorpo…

Cited by 14 publications (4 citation statements)
References 23 publications (34 reference statements)

“…Finally, point contamination places all outliers in the point $\mu_C$ so they behave like a tight cluster. These settings make the simulation consistent with those in Boudt et al (2020), Hubert et al (2012) and De Ketelaere et al (2020). The deviation of an estimated scatter matrix $\hat{\Sigma}$ relative to the true covariance matrix $\Sigma$ is measured by the Kullback-Leibler (KL) divergence $\mathrm{KL}(\hat{\Sigma}, \Sigma) = \operatorname{trace}(\hat{\Sigma}\Sigma^{-1}) - \log(\det(\hat{\Sigma}\Sigma^{-1})) - p$. The speedup factor is measured as speedup = time(MRCD)/time(KMRCD).…”
Section: Simulation Study With Linear Kernel (mentioning)
confidence: 99%
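
The two evaluation measures named in the statement above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not code from any of the cited papers: the function names `inverse_and_det`, `kl_divergence` and `speedup` are hypothetical, and both matrices are assumed to be symmetric positive definite so that the determinant ratio is positive.

```python
import math


def inverse_and_det(matrix):
    """Return (inverse, determinant) of a square matrix via Gauss-Jordan elimination."""
    n = len(matrix)
    # Work on a copy of the matrix augmented with the identity.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    det = 1.0
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        scale = aug[col][col]
        det *= scale
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def kl_divergence(sigma_hat, sigma):
    """KL(sigma_hat, sigma) = trace(sigma_hat sigma^-1) - log det(sigma_hat sigma^-1) - p."""
    p = len(sigma)
    sigma_inv, det_sigma = inverse_and_det(sigma)
    _, det_hat = inverse_and_det(sigma_hat)
    trace = sum(sigma_hat[i][j] * sigma_inv[j][i] for i in range(p) for j in range(p))
    # det(A B^-1) = det(A) / det(B); assumed positive for positive definite inputs.
    return trace - math.log(det_hat / det_sigma) - p


def speedup(time_baseline, time_fast):
    """Ratio of two running times, e.g. time(MRCD) / time(KMRCD)."""
    return time_baseline / time_fast
```

The divergence is zero when the estimate equals the true matrix and grows as the estimate deviates from it, so smaller values indicate a better scatter estimate.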
“…Computing the MCD was difficult at first but became faster with the algorithm of Rousseeuw and Van Driessen (1999) and the deterministic algorithm DetMCD (Hubert et al 2012). An algorithm for n in the millions was recently constructed (De Ketelaere et al 2020). But all algorithms for the original MCD require that the dimension p be lower than h in order to obtain an invertible covariance matrix.…”
Section: Introduction (mentioning)
confidence: 99%
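
The invertibility requirement mentioned in the statement above can be illustrated with a tiny example. This is a hedged sketch with hypothetical helper names (`sample_covariance`, `det2`) and made-up points: the sample covariance of too few points in p = 2 dimensions has determinant zero, while one additional point in general position makes it invertible.

```python
def sample_covariance(points):
    """Unbiased sample covariance matrix of a list of equal-length tuples."""
    n = len(points)
    p = len(points[0])
    mean = [sum(pt[k] for pt in points) / n for k in range(p)]
    return [
        [
            sum((pt[i] - mean[i]) * (pt[j] - mean[j]) for pt in points) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two points in two dimensions: the covariance matrix has rank 1 and is singular.
too_few = [(0.0, 0.0), (1.0, 2.0)]
# Three points that are not collinear: the covariance matrix is invertible.
enough = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)]

det_too_few = det2(sample_covariance(too_few))
det_enough = det2(sample_covariance(enough))
```

With a singular covariance matrix the determinant is zero for every such subset, so minimizing it no longer distinguishes between subsets.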
“…It has become clear that all agents involved in a given stock market may exhibit interconnections and correlations, representing important internal forces of the market (Collins & Biekpe, 2003; Jizba et al, 2012); that is, the movement of a stock market in a country is likely to be affected by movement of other stocks in both that country and in other regions (Masih & Masih, 2001). The following strategies have been proposed to identify and quantify interactions on this type of complex system (Greenblatt et al, 2012): i) space-time, such as covariance (Wang & Ye, 2016; De Ketelaere et al, 2018), correlation (Kenett et al, 2015), Granger causality (Papana et al, 2017), Shannon entropy (Sulthan et al, 2016), mutual information (Wang & Hui, 2017), and Renyi entropy (Brody et al, 2007); ii) space-frequency and space-time-frequency, such as Fourier transform (Fang & Chang, 2017; Saia et al, 2017), coherence (Vacha & Barunik, 2012), phase synchronization (Radhakrishnan et al, 2016), directed transfer function (Kamiński et al, 2001), wavelet transform (Joo & Kim, 2015; Saia, 2017), and cross-time frequency measures (Loh, 2013). The previous works study how the price of one stock is influenced by the economic factors of other markets.…”
Section: Interdependency Between Stock Markets (mentioning)
confidence: 99%
“…To cater to its broad applications, extensive effort has been made to improve the computational efficiency of the approximation algorithm. For example, Rousseeuw and Van Driessen (1999) propose the first computationally efficient algorithm, termed FASTMCD; Hubert et al (2012) suggest an improved version of FASTMCD, termed DetMCD; De Ketelaere et al (2020) accelerate DetMCD by refinement of the calculation steps and parallel computation. Furthermore, Boudt et al (2020) generalize the MCD to high-dimensional cases as the minimum regularized covariance determinant (MRCD).…”
Section: Introduction (mentioning)
confidence: 99%