2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2016.38
TI-POOLING: Transformation-Invariant Pooling for Feature Learning in Convolutional Neural Networks

Abstract: In this paper we present a deep neural network topology that incorporates a simple-to-implement transformation-invariant pooling operator (TI-POOLING). This operator is able to efficiently handle prior knowledge on nuisance variations in the data, such as rotation or scale changes. Most current methods make use of dataset augmentation to address this issue, but this requires a larger number of model parameters and more training data, and results in significantly increased training time and a larger chance o…
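To make the pooling operator concrete, here is a minimal sketch in PyTorch of the idea the abstract describes: each transformed copy of the input is passed through the same weight-shared feature extractor, and an element-wise maximum over the branch outputs is taken just before the classifier. The backbone, the set of rotation angles, and the tensor shapes are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged sketch of transformation-invariant (TI) pooling, assuming PyTorch.
# Backbone, transformation set, and shapes are illustrative, not the paper's
# exact configuration.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF


class TIPoolingNet(nn.Module):
    def __init__(self, feature_extractor: nn.Module, classifier: nn.Module, angles):
        super().__init__()
        self.features = feature_extractor   # shared weights across all branches
        self.classifier = classifier
        self.angles = list(angles)          # known nuisance transformations (here: rotations)

    def forward(self, x):                   # x: (batch, channels, H, W)
        # Run every transformed copy of the input through the same network.
        branch_outputs = [self.features(TF.rotate(x, float(a))) for a in self.angles]
        stacked = torch.stack(branch_outputs, dim=0)   # (|T|, batch, feat_dim)
        # TI-pooling: element-wise max over the transformation set,
        # taken just before the top (classification) layer.
        pooled, _ = stacked.max(dim=0)                 # (batch, feat_dim)
        return self.classifier(pooled)
```

A model would then be built as, e.g., `TIPoolingNet(backbone, nn.Linear(128, 10), angles=range(0, 360, 45))` (a hypothetical configuration), where taking the maximum over the rotated branches makes the pooled feature invariant to those rotations by construction.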

Cited by 229 publications (206 citation statements)
References 23 publications
“…Despite the effectiveness of data augmentation, the main drawback is that learning all possible transformations usually requires a large number of network parameters, which significantly increases the training cost and the risk of overfitting. Most recently, TI-Pooling [18] alleviates this drawback by using parallel network architectures for the transformation set and applying the transformation-invariant pooling operator on the outputs before the top layer. Nevertheless, with its built-in data augmentation, TI-Pooling requires significantly more training and testing computation than a standard CNN.…”
Section: B. Learning Feature Representations
Mentioning confidence: 99%
“…The weight decay is set to 0.00005, and the learning rate is halved every 25 epochs. The state-of-the-art STN [19], TI-Pooling [18], ResNet [13] and ORNs [2] are used for comparison. Among them, STN is more robust to spatial transformations than the baseline CNNs, owing to a spatial transformer layer placed before the first convolutional layer.…”
Section: A. MNIST
Mentioning confidence: 99%
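For concreteness, the quoted schedule (weight decay 0.00005, learning rate halved every 25 epochs) maps onto a standard step scheduler. Below is a hedged PyTorch sketch; the optimizer type (SGD), the initial learning rate of 0.01, and the placeholder model are assumptions not given in the excerpt.

```python
# Sketch of the quoted training schedule: weight decay 5e-5 and the learning
# rate halved every 25 epochs. Optimizer type and the initial learning rate
# (0.01) are assumptions not stated in the excerpt.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder model; any nn.Module would do
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,        # lr=0.01 is an assumption
                            momentum=0.9, weight_decay=5e-5)    # weight decay from the quote
# Halve the learning rate every 25 epochs, as in the quoted setup.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.5)

for epoch in range(100):
    ...  # forward pass, loss computation, and optimizer.step() would go here
    scheduler.step()
```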
“…This can be as simple as averaging the features (Anselmi et al. 2016), max-pooling (Laptev et al. 2016; Cohen and Welling 2016), or exploiting the group symmetry directly [such as ignoring the gradient 'sign' in Dalal and Triggs (2005) for vertical-flip invariance].…”
Section: Related Work
Mentioning confidence: 99%
“…Data augmentation [6] is the most popular technique for mitigating the effects of rotations. Despite its simplicity, it often leads to a larger number of model parameters and is prone to under- or over-fitting [7]. Another drawback of data augmentation is its black-box nature: it remains unclear how the network handles the various transformations.…”
Section: Introduction
Mentioning confidence: 99%
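As a baseline for comparison with the built-in invariance discussed above, the data-augmentation approach amounts to randomly transforming inputs at training time. Below is a hedged sketch assuming PyTorch/torchvision; the MNIST dataset, the full ±180° rotation range, and the batch size are illustrative choices, not taken from the cited works.

```python
# Sketch of the data-augmentation baseline discussed above: random rotations
# applied at training time. Dataset choice (MNIST), rotation range, and batch
# size are illustrative assumptions.
import torch
from torchvision import datasets, transforms

train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=180),  # sample a rotation in [-180, 180] per image
    transforms.ToTensor(),
])

train_set = datasets.MNIST(root="./data", train=True, download=True,
                           transform=train_transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
```

In this baseline the network itself is unchanged; invariance is only encouraged statistically through the randomized inputs, which is exactly the black-box behavior the quoted passage criticizes.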