2010
DOI: 10.1007/978-3-642-15825-4_10

Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition

Abstract: A common practice to gain invariant features in object recognition models is to aggregate multiple low-level features over a small neighborhood. However, the differences between those models make a comparison of the properties of different aggregation functions hard. Our aim is to gain insight into different functions by directly comparing them on a fixed architecture for several common object recognition tasks. Empirical results show that a maximum pooling operation significantly outperforms subsampling operations.

Cited by 1,079 publications (658 citation statements)
References 19 publications
“…5.5, 5.16) (Ranzato et al, 2007). Advantages of doing this were pointed out subsequently (Scherer et al, 2010). BP-trained MPCNNs have become central to many modern, competition-winning, feedforward, visual Deep Learners (Sec.…”
Section: UL-Based History Compression Through a Deep Stack of RNNs (mentioning)
confidence: 99%
“…The biggest architectural difference between our DNN and the CNN of LeCun et al (1998) is the use of max-pooling layers (Riesenhuber & Poggio, 1999; Serre et al, 2005; Scherer et al, 2010) instead of sub-sampling layers. The output of a max-pooling layer is given by the maximum activation over non-overlapping rectangular regions of size K_x × K_y.…”
Section: Max-Pooling Layer (mentioning)
confidence: 99%
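
A minimal NumPy sketch of the operation described in this excerpt: the maximum activation over non-overlapping K_x × K_y regions. The helper name max_pool and the example sizes are illustrative assumptions, not taken from the cited work, and the sketch assumes the map's height and width divide evenly by K_x and K_y.

import numpy as np

def max_pool(x, kx, ky):
    """Max over non-overlapping kx-by-ky regions of a 2-D feature map (illustrative helper)."""
    h, w = x.shape
    assert h % kx == 0 and w % ky == 0, "sketch assumes the map tiles exactly into kx-by-ky regions"
    # Reshape so each pooling region gets its own pair of axes, then reduce over those axes.
    return x.reshape(h // kx, kx, w // ky, ky).max(axis=(1, 3))

# Example: a 4x6 map pooled with 2x3 regions yields a 2x2 map.
fm = np.arange(24, dtype=float).reshape(4, 6)
print(max_pool(fm, 2, 3))  # [[ 8. 11.] [20. 23.]]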
“…These findings in conjunction with experimental studies of the visual cortex justify the use of such filters in the so-called standard model for object recognition (Riesenhuber & Poggio, 1999;Serre et al, 2005;Mutch & Lowe, 2008), whose filters are fixed, in contrast to those of Convolutional Neural Networks (CNNs) (LeCun et al, 1998;Behnke, 2003;Simard et al, 2003), whose weights (filters) are randomly initialized and learned in a supervised way using back-propagation (BP). A DNN, the basic building block of our proposed MCDNN, is a hierarchical deep neural network, alternating convolutional with max-pooling layers (Riesenhuber & Poggio, 1999;Serre et al, 2005;Scherer et al, 2010). A single DNN of our team won the offline Chinese character recognition competition (Liu et al, 2011), a classification problem with 3755 classes.…”
Section: Introduction (mentioning)
confidence: 99%
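
As a rough illustration of the alternating structure this excerpt describes (convolutional layers interleaved with max-pooling layers), the following toy sketch stacks two such stages. The layer count, filter sizes, tanh nonlinearity, and helper names (conv2d_valid, max_pool) are assumptions for illustration only, not the architecture of the cited MCDNN.

import numpy as np

def conv2d_valid(x, w):
    """'Valid' 2-D cross-correlation of a single-channel map with one filter (illustrative helper)."""
    fh, fw = w.shape
    out = np.empty((x.shape[0] - fh + 1, x.shape[1] - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + fh, j:j + fw] * w)
    return out

def max_pool(x, k):
    """Max over non-overlapping k-by-k regions."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.standard_normal((29, 29))       # small single-channel input, size chosen for illustration
for fsize, psize in [(6, 2), (5, 2)]:   # two conv/pool stages; filter and pool sizes are assumptions
    x = np.tanh(conv2d_valid(x, rng.standard_normal((fsize, fsize))))
    x = max_pool(x, psize)
print(x.shape)  # spatial resolution shrinks after every conv/pool stage: (4, 4)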
“…Recent advances leading to success in the ImageNet challenge stem from dropout to prevent overfitting [3], rectifying linear units for improved convergence, backpropagation through max-pooling [4] and GPU implementations for speed, all of which are also used in this work.…”
Section: Related Work (mentioning)
confidence: 99%
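
A sketch of what "backpropagation through max-pooling" amounts to, under the usual convention that the gradient of each pooled output flows only to the input position that attained the maximum in the forward pass. The helper max_pool_backward and the example values are assumptions for illustration, not the cited implementation.

import numpy as np

def max_pool_backward(grad_out, x, k):
    """Route each upstream gradient to the argmax position of its k-by-k pooling region."""
    grad_in = np.zeros_like(x)
    for i in range(x.shape[0] // k):
        for j in range(x.shape[1] // k):
            region = x[i * k:(i + 1) * k, j * k:(j + 1) * k]
            r, c = np.unravel_index(np.argmax(region), region.shape)
            grad_in[i * k + r, j * k + c] = grad_out[i, j]
    return grad_in

# Forward pass: 2x2 max pooling of an arbitrary 4x4 map.
x = np.array([[1., 5., 2., 0.],
              [3., 2., 4., 1.],
              [0., 1., 2., 2.],
              [6., 0., 3., 1.]])
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))

# Backward pass: only the positions that produced the maxima (5, 4, 6, 3) receive gradient.
print(max_pool_backward(np.ones_like(pooled), x, 2))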