2021
DOI: 10.48550/arxiv.2110.06948
Preprint

Challenges for Unsupervised Anomaly Detection in Particle Physics

Katherine Fraser,
Samuel Homiller,
Rashmish K. Mishra
et al.

Abstract: Anomaly detection relies on designing a score to determine whether a particular event is uncharacteristic of a given background distribution. One way to define a score is to use autoencoders, which rely on the ability to reconstruct certain types of data (background) but not others (signals). In this paper, we study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal (top and W) jets in a QCD background. We…

Citations: cited by 8 publications (21 citation statements)
References: 40 publications
“…In spite of this wealth of possible practical applications, the fundamental question still needs to be studied, namely what defines an anomaly search at the LHC? For large and stochastic datasets, the concept of outliers is not defined, because any LHC jet or event configuration will occur with a finite probability, especially after we include detector imperfections [51][52][53][54][55]. In this situation, a simple, working definition of anomalous data is an event which lies in a low-density phase space region.…”
Section: What Is Anomalous? (mentioning)
confidence: 99%
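The working definition quoted above (an anomalous event is one that lies in a low-density region of phase space) can be illustrated with a minimal sketch. It assumes the distance to the k-th nearest background point as a stand-in for inverse density; this choice, the function name `kth_neighbor_distance`, and the toy grid are illustrative assumptions, not taken from the cited papers.

```python
import math


def kth_neighbor_distance(point, reference, k=5):
    """Distance from `point` to its k-th nearest neighbour in `reference`.

    A large value means few reference points are nearby, i.e. the point
    sits in a low-density region (assumed proxy, not the papers' method).
    """
    dists = sorted(math.dist(point, r) for r in reference)
    return dists[k - 1]


# Toy "background": a regular 10 x 10 grid covering the unit square.
reference = [(i * 0.1, j * 0.1) for i in range(10) for j in range(10)]

dense_score = kth_neighbor_distance((0.45, 0.45), reference)   # inside the grid
sparse_score = kth_neighbor_distance((3.0, 3.0), reference)    # far from the grid
```

Under this proxy the point far from the grid receives the larger score. As the quoted statement stresses, such a point is only less probable, not impossible, so any threshold on the score is a choice.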
“…A similar approach was discussed in Ref. [55], where instead of k-means clustering a k-medoids algorithm was used to obtain the representatives of the background dataset. The MinD score assumes that regular datapoints are close to the k-means centroids, whereas outliers are not.…”
Section: K-nearest Centroids (mentioning)
confidence: 99%
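The idea behind the MinD score quoted above (regular datapoints are close to a k-means centroid, outliers are not) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the plain Lloyd's-algorithm `kmeans` helper, the Euclidean metric, the value k=2, and the toy two-blob background are all hypothetical choices, not the citing paper's implementation.

```python
import math
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids (tuples)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        # Assign every point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members)
                             for d in range(dim))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def min_distance_score(point, centroids):
    """Distance to the nearest centroid; larger means more anomalous."""
    return min(math.dist(point, c) for c in centroids)


# Toy background: two Gaussian blobs around (0, 0) and (5, 5).
rng = random.Random(1)
background = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(200)]
background += [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(200)]

centroids = kmeans(background, k=2)
score_typical = min_distance_score((0.1, -0.1), centroids)
score_outlier = min_distance_score((2.5, 2.5), centroids)
```

A point midway between the blobs scores higher than a point near one of them. The k-medoids variant mentioned in the quote would differ in that the representatives are actual background datapoints rather than cluster means.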
“…Since the initial proposals of using autoencoders for anomaly detection [40][41][42], a number of improvements and modifications have been suggested. An important observation is that autoencoders can be biased by the relative complexity of anomalous and background data, potentially leading to outliers with a lower loss than the background [129][130][131]. As the latent space in VAEs [33,34] is optimised to follow a known distribution for backgrounds, it can also be used as anomaly score [132,133].…”
Section: Unsupervised (mentioning)
confidence: 99%
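One way to read the remark above about using the VAE latent space as an anomaly score is through the Kullback-Leibler divergence between the encoder's output distribution and the prior. The sketch below assumes a diagonal-Gaussian encoder and a standard-normal prior, for which the divergence has a closed form; the function name and the example inputs are hypothetical, and the cited works may define their latent-space scores differently.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one encoded event.

    Closed form: 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1).
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


# An event encoded exactly onto the prior scores zero ...
score_background_like = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# ... while one whose latent mean sits far from the origin scores higher.
score_shifted = kl_to_standard_normal([3.0, 0.0], [0.0, 0.0])
```

Here `mu` and `log_var` stand for the per-event encoder outputs of an already trained VAE; training the network itself is outside the scope of this sketch.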
“…Examples of OCC can be found in an extensive literature review provided by [1]. As for AD, there is a growing number of applications that span from accelerator operations to physics analyses, the latter being of great interest for example at the Large Hadron Collider (LHC) since new physics beyond Standard Model (BSM) remains elusive (as discussed, e.g., in [2][3][4][5]). In both cases, one typically deals with multiple features that vary as a function of the phase space of the final state particles reconstructed in the detector.…”
Section: Introduction (mentioning)
confidence: 99%