2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00065
Bag of Tricks for Image Classification with Convolutional Neural Networks

Abstract: Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods. In the literature, however, most refinements are either briefly mentioned as implementation details or only visible in source code. In this paper, we will examine a collection of such refinements and empirically evaluate their impact on the final model accuracy through ablation study. We will show that, by combining these refinement…

Cited by 1,102 publications (546 citation statements). References 28 publications.
“…One set of techniques that is extremely useful in practice are the tweaks to the ResNet architecture described in [46]. These approaches are used by default in XResNet.…”
Section: Layers and Architectures
confidence: 99%
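The XResNet tweaks referenced in this statement correspond to the ResNet-B/C/D modifications evaluated in the paper: a stem built from three 3×3 convolutions instead of a single 7×7 convolution, and average pooling placed before the 1×1 projection on the downsampling shortcut. A minimal PyTorch sketch of the ResNet-C stem and the ResNet-D shortcut is given below; the function names are illustrative, not the XResNet API.

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, stride=1):
    """3x3 convolution followed by batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def resnet_c_stem(in_ch=3, out_ch=64):
    """ResNet-C stem: three 3x3 convolutions replace the single 7x7 convolution."""
    return nn.Sequential(
        conv_bn_relu(in_ch, out_ch // 2, stride=2),
        conv_bn_relu(out_ch // 2, out_ch // 2),
        conv_bn_relu(out_ch // 2, out_ch),
        nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    )

def resnet_d_shortcut(in_ch, out_ch, stride=2):
    """ResNet-D shortcut: average pooling before the 1x1 projection, so the
    downsampling no longer discards three quarters of the input activations."""
    return nn.Sequential(
        nn.AvgPool2d(kernel_size=2, stride=stride, ceil_mode=True),
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch),
    )
```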
“…In the shortcut branch, a convolution layer is used to change the dimensionality of the input data so that the feature vectors of the two branches have the same size. We perform batch normalization (BN) [44] right after the convolutions to prevent the gradient dispersion problem. ReLU [45] is applied after the addition to the shortcut.…”
Section: Block 2: 3D-ResNeXt Spectral-Spatial Feature Learning
confidence: 99%
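The layout described in this statement (a projection convolution on the shortcut when shapes differ, BN after each convolution, and ReLU applied only after the addition) is the standard residual-block pattern. A minimal 2D PyTorch sketch is shown below for illustration; the citing paper uses a 3D-ResNeXt variant, so this is not their exact block.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual block: BN after each convolution, ReLU after the addition."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # Projection shortcut so both branches produce feature maps of the same size.
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)   # ReLU is applied only after the addition
        return self.relu(out)
```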
“…The distortion correction and up-sampling were complementary, as their fusion significantly improved both. As He et al. [27] propose, a learning rate based on cosine decay should be used for image-learning tasks with high detail requirements. The learning rate was initially set to 10⁻² and then decreased following this fixed schedule.…”
Section: The Implementation Process
confidence: 99%
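The cosine learning-rate decay credited to He et al. [27] can be reproduced with a standard PyTorch scheduler. In the sketch below, the model, momentum, and epoch count are placeholder assumptions; only the initial rate of 10⁻² is taken from the statement above.

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

total_epochs = 120  # assumed training length
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_epochs)

for epoch in range(total_epochs):
    # ... per-batch training updates would go here ...
    optimizer.step()   # placeholder for the epoch's parameter updates
    scheduler.step()   # lr follows 0.5 * lr0 * (1 + cos(pi * epoch / total_epochs))
```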