Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. Unlike most established methods, they rely on low-level input, for instance calorimeter output. While their network architectures are vastly different, their performance is comparatively similar. In general, we find that these new approaches are extremely powerful and great fun.
Autoencoders as tools behind anomaly searches at the LHC have the
structural problem that they only work in one direction, extracting
jets with higher complexity but not the other way around. To
address this, we derive classifiers from the latent space of
(variational) autoencoders, specifically in Gaussian mixture and
Dirichlet latent spaces. In particular, the Dirichlet setup solves the
problem and improves both the performance and the interpretability
of the networks.
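As a rough illustration of a Dirichlet latent space, the sketch below draws latent vectors on the simplex and reads one coordinate as a classifier score. The abstract does not say how the latent space is sampled; the softmax-Gaussian (Laplace) approximation used here is an assumption, and the function name and the `alpha` values are hypothetical.

```python
import numpy as np

def dirichlet_softmax_sample(alpha, n_samples, seed=0):
    """Approximate Dirichlet(alpha) samples via a softmax of Gaussians.

    Assumption: Laplace approximation of the Dirichlet in softmax basis,
    which keeps the sampling step differentiable in a VAE.
    """
    alpha = np.asarray(alpha, dtype=float)
    k = len(alpha)
    # Mean and variance of the approximating Gaussian in logit space
    mu = np.log(alpha) - np.log(alpha).mean()
    var = (1.0 / alpha) * (1.0 - 2.0 / k) + (1.0 / alpha).sum() / k**2
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=(n_samples, k))
    logits = mu + np.sqrt(var) * eps
    # Numerically stable softmax maps the logits onto the simplex
    logits = logits - logits.max(axis=1, keepdims=True)
    r = np.exp(logits)
    return r / r.sum(axis=1, keepdims=True)

# Two latent components: r[:, 0] can be read as a per-jet classifier score
r = dirichlet_softmax_sample(alpha=[5.0, 1.0], n_samples=1000)
score = r[:, 0]
```

With two components, each jet is assigned mixture weights that sum to one, so a single coordinate is enough to rank jets in either direction.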
Searches for anomalies are the main motivation for the LHC and define key analysis steps, including triggers. We discuss how LHC anomalies can be defined through probability density estimates, evaluated in a physics space or in an appropriate neural network latent space. We illustrate this for classical k-means clustering, a Dirichlet variational autoencoder, and invertible neural networks. For two especially challenging scenarios of jets from a dark sector we evaluate the strengths and limitations of each method.
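The idea of flagging events in sparsely populated regions can be sketched with k-means: the distance to the nearest centroid serves as a crude stand-in for an inverse density. This is a minimal illustration on toy Gaussian data, not the analysis described in the abstract; the function names, the number of clusters, and the toy inputs are all assumptions.

```python
import numpy as np

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns the (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid
        dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids

def anomaly_score(x, centroids):
    """Distance to the nearest centroid; large values mean low density."""
    dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=-1)
    return dist.min(axis=1)

# Toy background sample standing in for a physics or latent space
rng = np.random.default_rng(1)
background = rng.normal(0.0, 1.0, size=(500, 2))
centroids = kmeans(background, k=4)

typical = anomaly_score(np.array([[0.0, 0.0]]), centroids)[0]
outlier = anomaly_score(np.array([[8.0, 8.0]]), centroids)[0]
```

A point far from all centroids receives a larger score than one in the bulk of the background, which is the behavior an anomaly score should have.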
Collider searches face the challenge of defining a representation of
high-dimensional data such that physical symmetries are manifest, the discriminating
features are retained, and the choice of representation is new-physics agnostic.
We introduce JetCLR to solve the mapping from low-level data to optimized observables
through self-supervised contrastive learning. As an example, we construct a data
representation for top and QCD jets using a permutation-invariant transformer-encoder
network and visualize its symmetry properties. We compare the JetCLR representation with
alternative representations using linear classifier tests and find it to work quite well.
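Contrastive learning of this kind can be sketched with a SimCLR-style (NT-Xent) loss, which pulls together the representations of two augmented views of the same jet and pushes apart those of different jets. The abstract does not state the loss, so this choice, the temperature value, and the random stand-in representations are assumptions.

```python
import numpy as np

def nt_xent_loss(z_a, z_b, temperature=0.1):
    """NT-Xent loss; z_a[i] and z_b[i] are two views of the same jet i."""
    n = len(z_a)
    z = np.concatenate([z_a, z_b], axis=0)
    # Cosine similarity between all pairs of normalized representations
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a view is never its own partner
    # Row i has its positive partner at index i + n (and vice versa)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob = sim[np.arange(2 * n), pos] - log_norm
    return -log_prob.mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 4))
# Nearly identical views (as after a symmetry transformation) vs. unrelated ones
aligned = nt_xent_loss(z, z + 0.01 * rng.normal(size=z.shape))
unrelated = nt_xent_loss(z, rng.normal(size=z.shape))
```

The loss is lower when paired views map to nearby representations, which is what drives the network toward a representation that is invariant under the chosen augmentations.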