We describe the outcome of a data challenge conducted as part of the Dark Machines (https://www.darkmachines.org) initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to 10 fb⁻¹ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Observations with the Fermi Large Area Telescope (LAT) indicate an excess in gamma rays originating from the center of our Galaxy. A possible explanation for this excess is the annihilation of Dark Matter particles. We have investigated the annihilation of neutralinos as Dark Matter candidates within the phenomenological Minimal Supersymmetric Standard Model (pMSSM). An iterative particle filter approach was used to search for solutions within the pMSSM. We found solutions that are consistent with astroparticle physics and collider experiments, and provide a fit to the energy spectrum of the excess. The neutralino is a Bino/Higgsino or Bino/Wino/Higgsino mixture with a mass in the range 84–92 GeV or 87–97 GeV annihilating into W bosons. A third solution is found for a neutralino of mass 174–187 GeV annihilating into top quarks. The best solutions yield a Dark Matter relic density 0.06 < Ωh² < 0.13. These pMSSM solutions make clear forecasts for LHC, direct and indirect DM detection experiments. If the pMSSM explanation of the excess seen by Fermi-LAT is correct, a DM signal might be discovered soon.
Simulating nature, and in particular processes in particle physics, requires expensive computations that can take much longer than scientists can afford. Here, we explore possible solutions to this problem by investigating recent advances in generative modeling, and present a study for the generation of events from a physical process with deep generative models. The simulation of physical processes requires not only the production of physical events, but also ensuring that these events occur with the correct frequencies. We investigate the feasibility of learning the event generation and the frequency of occurrence with several generative machine learning models to produce events like Monte Carlo generators. We study three processes: a simple two-body decay, and the processes e⁺e⁻ → Z → l⁺l⁻ and pp → tt̄, including the decay of the top quarks and a simulation of the detector response. By buffering density information of encoded Monte Carlo events given the encoder of a Variational Autoencoder we are able to construct a prior for the sampling of new events from the decoder that yields distributions that are in very good agreement with real Monte Carlo events and are generated several orders of magnitude faster. Applications of this work include generic density estimation and sampling, targeted event generation via a principal component analysis of encoded ground truth data, anomaly detection and more efficient importance sampling, e.g., for the phase space integration of matrix elements in quantum field theories.
We present a study for the generation of events from a physical process with deep generative models. The simulation of physical processes requires not only the production of physical events, but also ensuring that these events occur with the correct frequencies. We investigate the feasibility of learning the event generation and the frequency of occurrence with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) to produce events like Monte Carlo generators. We study three processes: a simple two-body decay, and the processes e⁺e⁻ → Z → l⁺l⁻ and pp → tt̄, including the decay of the top quarks and a simulation of the detector response. We find that the tested GAN architectures and the standard VAE are not able to learn the distributions precisely. By buffering density information of encoded Monte Carlo events given the encoder of a VAE we are able to construct a prior for the sampling of new events from the decoder that yields distributions that are in very good agreement with real Monte Carlo events and are generated several orders of magnitude faster. Applications of this work include generic density estimation and sampling, targeted event generation via a principal component analysis of encoded ground truth data, anomaly detection and more efficient importance sampling, e.g. for the phase space integration of matrix elements in quantum field theories.
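The buffering idea in the abstract above can be illustrated with a minimal sketch. It assumes that the "density information" is the per-event latent mean and standard deviation returned by a trained encoder, and that new latent points are drawn from the resulting mixture of Gaussians. The names `LatentCode` and `sample_from_buffer` are hypothetical, and the encoder and decoder themselves are not shown; this is one plausible reading of the abstract, not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple

# Assumption: each buffered entry holds the latent means and standard
# deviations that an encoder produced for one Monte Carlo event.
LatentCode = Tuple[Sequence[float], Sequence[float]]


def sample_from_buffer(
    buffer: Sequence[LatentCode],
    n_samples: int,
    rng: random.Random,
) -> List[List[float]]:
    """Draw latent vectors from a mixture of the buffered Gaussians.

    Each sample picks one buffered event uniformly at random and then
    draws every latent dimension from that event's Gaussian.
    """
    if not buffer:
        raise ValueError("buffer must contain at least one encoded event")
    samples = []
    for _ in range(n_samples):
        means, sigmas = rng.choice(buffer)
        samples.append([rng.gauss(m, s) for m, s in zip(means, sigmas)])
    return samples


if __name__ == "__main__":
    # Two made-up encoded events in a 2-dimensional latent space.
    toy_buffer = [([0.0, 1.0], [0.1, 0.1]), ([3.0, -1.0], [0.2, 0.2])]
    latent_points = sample_from_buffer(toy_buffer, 5, random.Random(0))
    # In a full pipeline these points would be passed to the decoder.
    for point in latent_points:
        print(point)
```

Sampling from the buffered encodings instead of a standard normal prior keeps the generated latent points in regions that the encoder actually populated, which is the motivation the abstract gives for constructing a prior from encoded events.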
The Fermi Collaboration has recently updated their analysis of gamma rays from 7 See ref.[52] for more information.
We present the application of convolutional neural networks to a particular problem in gamma ray astronomy. Explicitly, we use this method to investigate the origin of an excess emission of GeV γ rays in the direction of the Galactic Center, reported by several groups by analyzing Fermi-LAT data. Interpretations of this excess include γ rays created by the annihilation of dark matter particles and γ rays originating from a collection of unresolved point sources, such as millisecond pulsars. We train and test convolutional neural networks with simulated Fermi-LAT images based on point and diffuse emission models of the Galactic Center tuned to measured γ ray data. Our new method allows precise measurements of the contribution and properties of an unresolved population of γ ray point sources in the interstellar diffuse emission model. The current model predicts the fraction of unresolved point sources with an error of up to 10% and this is expected to decrease with future work. arXiv:1708.06706v2 [astro-ph.HE]
We propose a new method to define anomaly scores and apply this to particle physics collider events. Anomalies can be either rare, meaning that these events are a minority in the normal dataset, or different, meaning they have values that are not inside the dataset. We quantify these two properties using an ensemble of One-Class Deep Support Vector Data Description models, which quantifies differentness, and an autoregressive flow model, which quantifies rareness. These two parameters are then combined into a single anomaly score using different combination algorithms. We train the models using a dataset containing only simulated collisions from the Standard Model of particle physics and test them using various hypothetical signals in four different channels and a secret dataset where the signals are unknown to us. The anomaly detection method described here has been evaluated in a summary paper where it performed very well compared to a large number of other methods. The method is simple to implement and is applicable to other datasets in other fields as well.
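The abstract above does not name the combination algorithms, so the following is only a sketch of how a rareness score and a differentness score might be merged into one number per event. The min-max normalization and the three combiners (`mean`, `product`, `max`) are assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale scores linearly to the interval [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All events score the same, so none stands out.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Assumed combiners, not taken from the paper.
COMBINERS: Dict[str, Callable[[float, float], float]] = {
    "mean": lambda r, d: 0.5 * (r + d),
    "product": lambda r, d: r * d,  # high only if both scores are high
    "max": lambda r, d: max(r, d),  # high if either score is high
}


def combine_scores(
    rareness: Sequence[float],
    differentness: Sequence[float],
    how: str = "mean",
) -> List[float]:
    """Combine two per-event scores into a single anomaly score."""
    if len(rareness) != len(differentness):
        raise ValueError("both score lists must cover the same events")
    combiner = COMBINERS[how]
    return [
        combiner(r, d)
        for r, d in zip(min_max(rareness), min_max(differentness))
    ]


if __name__ == "__main__":
    rare = [0.0, 1.0, 2.0]
    diff = [4.0, 2.0, 0.0]
    for name in COMBINERS:
        print(name, combine_scores(rare, diff, how=name))
```

The product behaves like a logical AND (an event must be both rare and different to score highly), while the maximum behaves like an OR; which behavior is preferable depends on the signals being targeted.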
We develop a novel method based on machine learning principles to achieve optimal initiation of CPU-intensive computations for forward asteroseismic modeling in a multi-dimensional parameter space. A deep neural network is trained on a precomputed asteroseismology grid containing about 62 million coherent oscillation-mode frequencies derived from stellar evolution models. These models are representative of the core-hydrogen burning stage of intermediate-mass and high-mass stars. The evolution models constitute a 6D parameter space and their predicted low-degree pressure- and gravity-mode oscillations are scanned using a genetic algorithm. A software pipeline is created to find the best-fitting stellar parameters for a given set of observed oscillation frequencies. The proposed method finds the optimal regions in the 6D parameter space in less than a minute, hence providing the optimal starting point for further and more detailed forward asteroseismic modeling in a high-dimensional context. We test and apply the method to seven pulsating stars that were previously modeled asteroseismically by classical grid-based forward modeling based on a χ² statistic and obtain good agreement with past results. Our deep learning methodology opens up the application of asteroseismic modeling in +6D parameter space for thousands of stars pulsating in coherent modes with long lifetimes observed by the Kepler space telescope and to be discovered with the TESS and PLATO space missions, while applications so far were done star-by-star for only a handful of cases. Our method is open source and can be used by anyone freely.
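For comparison, the classical grid-based approach mentioned in the abstract above can be sketched as a brute-force χ² scan. This sketch assumes that observed and predicted frequencies are already matched mode by mode and that each observed frequency has a known uncertainty; the function names and the toy grid are hypothetical and do not come from the paper.

```python
from typing import Dict, Sequence, Tuple


def chi_squared(
    observed: Sequence[float],
    predicted: Sequence[float],
    sigma: Sequence[float],
) -> float:
    """Sum of squared, uncertainty-scaled frequency residuals."""
    if not (len(observed) == len(predicted) == len(sigma)):
        raise ValueError("all inputs must have one entry per mode")
    return sum(
        ((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigma)
    )


def best_grid_model(
    observed: Sequence[float],
    sigma: Sequence[float],
    grid: Dict[str, Sequence[float]],
) -> Tuple[str, float]:
    """Return the label and chi-squared of the best-matching grid model."""
    best = min(grid, key=lambda k: chi_squared(observed, grid[k], sigma))
    return best, chi_squared(observed, grid[best], sigma)


if __name__ == "__main__":
    # Made-up frequencies (arbitrary units) for two hypothetical models.
    toy_grid = {"model_a": [1.0, 2.0], "model_b": [1.1, 2.1]}
    print(best_grid_model([1.0, 2.0], [0.1, 0.1], toy_grid))
```

A scan like this evaluates every grid point, which is what becomes expensive in a 6D parameter space and motivates replacing the grid lookup with a trained network.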