2016
DOI: 10.1007/978-3-319-44781-0_9

Analysis of Dropout Learning Regarded as Ensemble Learning

Abstract: Deep learning is the state of the art in fields such as visual object recognition and speech recognition. Such networks use many layers with huge numbers of units and connections, so overfitting is a serious problem, and dropout, a kind of regularization technique, is used to counter it. In online learning, however, the effect of dropout is not well understood. This paper presents our investigation of the effect of dropout in online learning. We analyzed the effect of dropout on convergence sp…

Cited by 36 publications (20 citation statements)
References 8 publications (20 reference statements)
“…These subnetworks then form an ensemble of small networks. Making inference using a trained deep network is akin to using the ensemble mean to make predictions, which is more robust (Baldi & Sadowski; Hara et al.); and (2) because connections are randomly blocked, neuron weights cannot adjust at the same time to cancel each other's effects to fit the target (Hinton, Srivastava, et al.). The simultaneous adjustment, termed coadaptation, is a primary reason for overfitting.…”
Section: Basics
confidence: 99%
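To make the subnetwork view concrete, here is a minimal NumPy sketch (illustrative, not from the cited paper; all names are assumptions): each training pass samples a random mask, i.e. one random subnetwork, and the usual test-time scaling by the keep probability approximates averaging the predictions of all those subnetworks.

```python
# Minimal sketch of dropout viewed as an implicit ensemble (assumed names).
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, keep_prob, train=True):
    """Apply dropout to hidden activations h."""
    if train:
        mask = rng.random(h.shape) < keep_prob  # sample one subnetwork
        return h * mask                         # drop the masked units
    return h * keep_prob                        # test-time "ensemble mean" shortcut

h = rng.standard_normal((4, 8))                          # a batch of activations
h_train = dropout_forward(h, keep_prob=0.5)              # random subnetwork
h_test = dropout_forward(h, keep_prob=0.5, train=False)  # scaled full network
```

Scaling by the keep probability at test time is the cheap stand-in for the exact ensemble mean over the exponentially many possible masks.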
“…Once complete, an average of the discovered models was taken at test time and a performance increase was observed. Hara et al. [16] proposed the idea that regularisation methods such as Dropout can be considered to be ensembling techniques. They showed that model accuracy can be improved by taking an average over a network with learned and unlearned units.…”
Section: Related Work
confidence: 99%
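The averaging idea attributed to Hara et al. can be illustrated with a toy setup (weights and function names below are assumptions, not the authors' code): sample dropout masks at test time as well, and average the thinned-network outputs explicitly rather than using the weight-scaling shortcut.

```python
# Hypothetical toy network: explicit ensemble mean over sampled subnetworks.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((8, 16))   # input -> hidden weights (toy values)
W2 = rng.standard_normal((16, 1))   # hidden -> output weights (toy values)

def subnet_predict(x, keep_prob):
    """One forward pass through a randomly thinned subnetwork."""
    h = np.maximum(x @ W1, 0.0)             # ReLU hidden layer
    mask = rng.random(h.shape) < keep_prob  # drop hidden units at test time too
    return (h * mask) @ W2

def ensemble_predict(x, keep_prob=0.5, n_samples=100):
    """Explicit ensemble mean over n_samples sampled subnetworks."""
    return np.mean([subnet_predict(x, keep_prob) for _ in range(n_samples)],
                   axis=0)

x = rng.standard_normal((4, 8))
y_hat = ensemble_predict(x)  # averages 100 subnetwork predictions
```

The average mixes networks in which any given unit was sometimes active ("learned") and sometimes dropped ("unlearned"), which is the sense in which the averaged prediction draws on both.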
“…The work reported here is also clearly related to the full combination method of multi-band processing [21], where a neural network is trained on each combination of bands. In our case, however, it is not necessary to explicitly train 2^N (where N is the number of bands used) different networks, as dropout can also be regarded as an ensemble technique [22]. And given that the multi-band approach is a special case of multi-stream processing, the present study is also closely related to the multi-stream framework of Mallidi et al. [23], which drops certain streams whilst training the network for band combination.…”
Section: Related Work
confidence: 99%
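A rough sketch of that connection, under assumed shapes and names: dropping whole feature streams (frequency bands) during training means each step effectively trains one of the 2^N band combinations, so a single network implicitly covers the full combination ensemble without training 2^N separate models.

```python
# Sketch of dropout over whole feature streams rather than single units.
import numpy as np

rng = np.random.default_rng(2)

def stream_dropout(bands, keep_prob=0.8, train=True):
    """bands: list of N per-band feature arrays, each of shape (batch, dim)."""
    if not train:
        # Test time: keep every band, scaled by the keep probability.
        return np.concatenate(bands, axis=1) * keep_prob
    kept = [b * (rng.random() < keep_prob) for b in bands]  # drop whole bands
    return np.concatenate(kept, axis=1)

bands = [rng.standard_normal((4, 10)) for _ in range(3)]  # N = 3 bands
x_step = stream_dropout(bands)  # one random band combination per training step
```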