2014
DOI: 10.1016/j.artint.2014.02.004

The dropout learning algorithm

Abstract: Dropout is a recently introduced algorithm for training neural networks by randomly dropping units during training to prevent their co-adaptation. A mathematical analysis of some of the static and dynamic properties of dropout is provided using Bernoulli gating variables, general enough to accommodate dropout on units or connections, and with variable rates. The framework allows a complete analysis of the ensemble averaging properties of dropout in linear networks, which is useful to understand the non-linear c…
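As an illustrative sketch (not code from the paper), dropout expressed through Bernoulli gating variables might look like the following. The keep probabilities, the inverted 1/p rescaling convention, and the toy layer shapes are assumptions made for the example; the same gating can be applied either to units or to individual connections, with variable rates, as the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_units(x, keep_prob, training=True):
    """Bernoulli gating on units ("inverted" dropout): each unit is kept with
    probability keep_prob and rescaled by 1/keep_prob so the expected
    activation matches the test-time (no-dropout) forward pass."""
    if not training or keep_prob >= 1.0:
        return x
    gates = rng.binomial(1, keep_prob, size=x.shape)  # Bernoulli gating variables
    return x * gates / keep_prob

def dropout_connections(W, keep_prob, training=True):
    """The same Bernoulli gating applied to individual connections (weights)
    rather than units, which the general framework also covers."""
    if not training or keep_prob >= 1.0:
        return W
    gates = rng.binomial(1, keep_prob, size=W.shape)
    return W * gates / keep_prob

# Toy forward pass through one layer with unit dropout at keep probability 0.5.
x = rng.normal(size=(4, 8))   # a batch of 4 inputs with 8 features
W = rng.normal(size=(8, 3))   # weights of a single hidden layer
h = np.maximum(dropout_units(x, keep_prob=0.5) @ W, 0.0)
```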

Cited by 318 publications (252 citation statements)
References 35 publications
“…Learning rate decayed linearly from 0.01 to a final value, starting and finishing at a specified number of epochs. Dropout (in which nodes are randomly removed during training) with values of p from 0.0 to 0.5 was used at several combinations of layers to add regularization [37,38]. These networks had 9 fully connected hidden layers with rectified linear units [39,40].…”
Section: Feedforward Neural Network
Mentioning confidence: 99%
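A minimal sketch of the kind of network this excerpt describes, assuming made-up layer widths, dropout placement, epoch count, data, and optimizer details (the excerpt specifies only the 9 ReLU hidden layers, dropout rates between 0.0 and 0.5, and a learning rate decayed linearly from 0.01):

```python
import torch
import torch.nn as nn

# Hypothetical fully connected network: 9 hidden layers with ReLU and dropout
# after each hidden layer. Widths and the dropout rate are illustrative.
def make_mlp(n_in, n_out, width=128, n_hidden=9, p_drop=0.2):
    layers, d = [], n_in
    for _ in range(n_hidden):
        layers += [nn.Linear(d, width), nn.ReLU()]
        if p_drop > 0.0:                 # p in [0.0, 0.5] in the excerpt
            layers.append(nn.Dropout(p=p_drop))
        d = width
    layers.append(nn.Linear(d, n_out))
    return nn.Sequential(*layers)

model = make_mlp(n_in=20, n_out=2)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
# Linear decay of the learning rate from 0.01 toward a smaller final value.
sched = torch.optim.lr_scheduler.LinearLR(opt, start_factor=1.0,
                                          end_factor=0.1, total_iters=100)

x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
    sched.step()
```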
“…Dropout prevents neurons from co-adapting by randomly setting a fraction of them, governed by the dropout hyperparameter, to zero at each training iteration. This results in a model that can be interpreted as randomly sampling from an exponential number of similar networks [64], and creates more generalizable representations of the data.…”
Section: Appendix A: Neural Network Glossary
Mentioning confidence: 99%
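The ensemble interpretation is exact in the linear case analyzed in the paper: averaging a linear layer's output over dropout masks equals running the deterministic layer with the weights (equivalently the inputs) scaled by the keep probability. A quick numerical check, with toy shapes and a keep probability chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=8)        # a single input vector
W = rng.normal(size=(8, 3))   # one linear layer
p = 0.5                       # probability of keeping each input unit

# Monte Carlo average of the layer output over many dropout masks
# (classic, non-rescaled dropout during training).
masks = rng.binomial(1, p, size=(200_000, 8))
mc_average = ((x * masks) @ W).mean(axis=0)

# For a linear layer the ensemble average is exact: it equals the
# deterministic layer with the input scaled by p.
print(mc_average)
print(p * x @ W)              # agrees with the Monte Carlo estimate up to noise
```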
“…A tanh activation function was applied to the nodes in each layer, as well as an L2 regularizer with weight decay set to 0.001. Dropout [62-64] was also applied to each layer. While many variations of the network structure were investigated, a systematic hyperparameter tuning was not undertaken due to computational limitations.…”
Section: A Fully-connected Network
Mentioning confidence: 99%
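A hypothetical sketch combining the ingredients this excerpt lists, namely tanh activations, an L2 penalty with weight decay 0.001, and dropout on each layer; the layer sizes, dropout rate, optimizer, and data are assumptions made for the example, not the cited setup:

```python
import torch
import torch.nn as nn

# Small fully connected network: tanh activations with dropout after each
# hidden layer. Widths and the dropout rate are illustrative choices.
model = nn.Sequential(
    nn.Linear(16, 64), nn.Tanh(), nn.Dropout(p=0.3),
    nn.Linear(64, 64), nn.Tanh(), nn.Dropout(p=0.3),
    nn.Linear(64, 1),
)
# weight_decay adds the L2 regularization term to the parameter updates.
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.001)

x, y = torch.randn(32, 16), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```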
“…While running stochastic gradient descent, dropout involves randomly sampling the set of features that are considered in any step of the algorithm. However, in contrast to randomized splitting in a forest, the effect of dropout training is fairly well understood: for example, in the case of single-layer models, dropout can be understood as a form of data-adaptive ridge-like regularization (Baldi and Sadowski 2014; Wager et al. 2013). Fleshing out the connections between dropout and random forests could provide new insights about the role of feature sampling in growing trees.…”
Section: Why Is Feature Sampling Helpful?
Mentioning confidence: 99%
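A quick numerical illustration of that ridge-like view (a sketch under assumptions, not the derivation from the cited papers): for a linear model with squared error and Bernoulli dropout on the inputs (with 1/p rescaling), averaging the loss over dropout masks adds a data-adaptive penalty of (1 - p)/p times the sum over features of w_j^2 multiplied by the feature's squared norm. The sizes, keep probability, and random data below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, p = 50, 5, 0.8                       # samples, features, keep probability
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
w = rng.normal(size=d)

# Monte Carlo: squared-error loss averaged over Bernoulli dropout masks on the
# inputs, with 1/p rescaling so the expected input is unchanged.
masks = rng.binomial(1, p, size=(20_000, n, d))
preds = (X * masks / p) @ w                # shape (20000, n)
mc_loss = ((y - preds) ** 2).sum(axis=1).mean()

# Closed form: ordinary squared error plus a data-adaptive ridge penalty,
# (1 - p)/p * sum_j w_j^2 * sum_i x_ij^2  (cf. Wager et al. 2013).
plain_loss = ((y - X @ w) ** 2).sum()
ridge_term = (1 - p) / p * np.sum(w ** 2 * (X ** 2).sum(axis=0))
print(mc_loss, plain_loss + ridge_term)    # agree up to Monte Carlo noise
```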