2018 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn.2018.8489605
XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning

Abstract: A new semi-supervised ensemble algorithm called XGBOD (Extreme Gradient Boosting Outlier Detection) is proposed, described and demonstrated for the enhanced detection of outliers from normal observations in various practical datasets. The proposed framework combines the strengths of both supervised and unsupervised machine learning methods by creating a hybrid approach that exploits each of their individual performance capabilities in outlier detection. XGBOD uses multiple unsupervised outlier mining algorithm…
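The core idea in the abstract — scores from unsupervised outlier detectors appended to the raw features as a learned representation for a supervised booster — can be sketched in a few lines. This is an illustrative numpy-only sketch, not the paper's implementation: `knn_outlier_scores` is a hypothetical stand-in for the battery of unsupervised detectors XGBOD actually uses, and the XGBoost training step is omitted.

```python
import numpy as np

def knn_outlier_scores(X, k=5):
    # Simple unsupervised outlier score: mean Euclidean distance to the
    # k nearest neighbours (larger score = more isolated point).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, 1:k + 1].mean(axis=1)  # column 0 is the zero self-distance

def augment_features(X, ks=(3, 5, 10)):
    # XGBOD-style augmentation: concatenate the original features with a
    # column of unsupervised outlier scores per detector configuration.
    scores = np.column_stack([knn_outlier_scores(X, k) for k in ks])
    return np.hstack([X, scores])

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X_aug = augment_features(X)
print(X_aug.shape)  # (100, 7): 4 original features + 3 score columns
```

In the full method, `X_aug` would then be fed to a supervised XGBoost classifier trained on the labelled outliers, which is where the "semi-supervised ensemble" framing comes from.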


Cited by 85 publications (65 citation statements)
References 24 publications
“…In this work, for multivariate data, we have compared the methodologies proposed with some multivariate outlier detection techniques. In the future, systematic experiments comparing with other well-known methodologies such as XGBOD [29], LODES [30], iForest [31] or MASS [32] are to be carried out. Regarding these multivariate techniques, another interesting research line is the extension of such methodologies to functional data analysis.…”
Section: Discussion
confidence: 99%
“…While it is possible to experimentally determine an optimal k with cross-validation [16] when ground truth is available, a similar trivial approach does not exist in an unsupervised setting. For these reasons, we recommend setting k = 0.1n, 10% of the training samples, bounded in the range of [30, 100], which yielded good results in practice.…”
Section: Local Region Definition
confidence: 99%
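The heuristic quoted above (k equal to 10% of the training samples, clipped to [30, 100]) is a one-liner; the function name below is ours, not the cited paper's.

```python
def local_region_k(n_train):
    # Heuristic from the citing paper: k = 10% of the training samples,
    # clipped to the range [30, 100].
    return int(min(max(0.1 * n_train, 30), 100))

print(local_region_k(200))   # 30  (10% = 20, raised to the lower bound)
print(local_region_k(500))   # 50  (10% falls inside the bounds)
print(local_region_k(5000))  # 100 (10% = 500, capped at the upper bound)
```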
“…Classification can be performed at the individual frame level, where each frame is treated as an independent sample, or at the song level where the goal is to classify the artist corresponding to a particular song using multiple samples. The latter can be interpreted as a form of ensembling where aggregating frame level predictions and voting up to the song level can yield variance reduction; this has been an effective approach in various interdisciplinary machine learning studies [26]-[28].…”
Section: Frame Level Versus Song Level Evaluation
confidence: 99%
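The frame-to-song aggregation described above amounts to a majority vote over per-frame predictions. A minimal sketch, assuming integer class labels (the function name is ours, for illustration only):

```python
import numpy as np

def song_level_vote(frame_preds):
    # Aggregate frame-level class predictions into a single song-level
    # label by majority vote; averaging many noisy per-frame decisions
    # is what gives the variance reduction mentioned in the excerpt.
    vals, counts = np.unique(frame_preds, return_counts=True)
    return vals[np.argmax(counts)]

# Five frames from one song, predicted classes 0/1/2:
print(song_level_vote(np.array([2, 2, 1, 2, 0])))  # 2
```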