2021
DOI: 10.48550/arxiv.2109.10919
Preprint
An Exploration of Learnt Representations of W Jets

Abstract: I present a Variational Autoencoder (VAE) trained on collider physics data (specifically boosted W jets), with reconstruction error given by an approximation to the Earth Mover's Distance (EMD) between input and output jets. This VAE learns a concrete representation of the data manifold, with semantically meaningful and interpretable latent space directions which are hierarchically organized in terms of their relation to physical EMD scales in the underlying physical generative process. A hyperparameter β contr…
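The abstract's reconstruction loss is an approximation to the EMD between input and output jets. As a minimal sketch of the underlying idea (not the paper's actual loss), the balanced EMD between two jets whose particles all carry equal pT weight reduces to an optimal assignment problem over pairwise distances in the rapidity-azimuth plane; the function name and toy jets below are illustrative, and the general EMD with unequal pT weights requires a full transport solver.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def jet_emd_equal_weights(jet_a, jet_b):
    """EMD between two jets with equally weighted particles.

    Each jet is an (n, 2) array of (y, phi) coordinates; the ground metric
    is Euclidean distance in the rapidity-azimuth plane. In this balanced,
    equal-weight special case optimal transport reduces to an optimal
    assignment (a simplification of the general pT-weighted EMD)."""
    # pairwise ground distances between particles of the two jets
    diff = jet_a[:, None, :] - jet_b[None, :, :]
    cost = np.linalg.norm(diff, axis=-1)
    # solve the assignment problem: which particle moves where
    row, col = linear_sum_assignment(cost)
    return cost[row, col].mean()  # average transport distance per unit weight

# toy 3-particle jets at (y, phi); identical jets have zero EMD
a = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
print(jet_emd_equal_weights(a, a))  # → 0.0
```

Translating every particle of a jet by a fixed offset moves the EMD by exactly that offset, which is the kind of physically interpretable distance scale the abstract's latent directions are organized around.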

Cited by 3 publications (5 citation statements)
References 20 publications
“…In order to understand what a VAE is learning, we study its latent space. In particular, we look at the distance between events in VAE latent space (see [48,63] for other studies of VAE latent spaces in particle physics). Since we can think of the VAE anomaly score as a "distance" encoding how far any given event is from the background distribution, it is also natural to ask about the distances between individual events.…”
mentioning
confidence: 99%
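The excerpt above treats distances between events in VAE latent space as meaningful, with the anomaly score acting as a distance from the background distribution. A minimal sketch of that idea, using hypothetical latent means from a trained encoder (the latent codes and the mean-distance score below are illustrative, not the cited paper's construction):

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
# hypothetical latent means mu(x) from a trained VAE encoder:
# 5 "background-like" events near the origin, 1 displaced "anomalous" event
latent = np.vstack([rng.normal(0.0, 0.1, size=(5, 8)),
                    np.full((1, 8), 2.0)])

# all pairwise Euclidean distances between events in latent space
pairwise = cdist(latent, latent)

# a simple latent-distance "anomaly score": mean distance to all other events
score = pairwise.sum(axis=1) / (len(latent) - 1)
print(score.argmax())  # → 5, the displaced event sits farthest from the rest
```

An event far from the bulk of the background in latent space gets a large mean distance, which is the sense in which an anomaly score can be read as a "distance" from the background distribution.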
“…In order to understand what a VAE is learning, we study its latent space. In particular, we look at the distance between events in VAE latent space (see [48,64] for other studies of VAE latent spaces in particle physics). Since we can think of the VAE anomaly score as a "distance" encoding how far any given event is from the background distribution, it is also natural to ask about the distances between individual events.…”
Section: JHEP03(2022)066
mentioning
confidence: 99%
“…[27] finds the optimal size to be around 20-34 for their top-tagging data. We chose the latent space size by performing a small scan over latent dimension sizes [2,4,8,16,32,64] on the "The Machine Learning Landscape of Top Taggers" data [82] and found that the larger latent space yielded better top tagging. We then changed to the current data set, as there are more signals to consider (i.e.…”
Section: Autoencoders
mentioning
confidence: 99%
“…(8), computing the linearized distance between all pairs of events E and Ẽ using pluOT only requires O(N_evt^2) computations of the weighted Euclidean metric; see eq. (12). Given that computing a weighted Euclidean metric is several orders of magnitude faster than computing the Hellinger-Kantorovich metric, our approach using pluOT offers a substantial computational advantage compared to computing the exact Hellinger-Kantorovich distance between all pairs of events.…”
Section: Linearized Hellinger-Kantorovich Metric
mentioning
confidence: 99%
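The computational point in the excerpt above is that once each event is linearized into a fixed-length embedding, all O(N_evt^2) pairwise distances reduce to cheap weighted Euclidean computations that vectorize in one step. A sketch under assumed inputs (the embeddings V and weights w below are random placeholders, not pluOT's actual API or embedding):

```python
import numpy as np

# hypothetical linearized embeddings: each event mapped to a fixed-length
# vector (e.g. a tangent-space / linear-OT embedding), with per-coordinate
# weights w from a reference measure -- names are illustrative
rng = np.random.default_rng(1)
n_evt, dim = 4, 6
V = rng.normal(size=(n_evt, dim))
w = rng.uniform(0.5, 1.5, size=dim)

# all O(N_evt^2) pairwise weighted Euclidean distances in one vectorized step:
# d(i, j)^2 = sum_k w_k * (V[i, k] - V[j, k])^2
diff = V[:, None, :] - V[None, :, :]
D = np.sqrt((w * diff**2).sum(axis=-1))

print(D.shape)              # → (4, 4)
print(np.allclose(D, D.T))  # → True: the distance matrix is symmetric
```

Each entry of D costs only a length-dim weighted sum, which is why the linearized approach is orders of magnitude cheaper than solving an exact Hellinger-Kantorovich transport problem for every pair of events.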
“…In [2], the EMD formed the foundation of a unified approach to collider observables based on the distance between an event and a manifold on the space of events, recasting decades of collider physics in the language of optimal transport. Optimal transport also underlies a number of recently-developed machine learning frameworks for anomaly detection and event generation at the LHC [7][8][9][10][11][12].…”
Section: Introduction
mentioning
confidence: 99%