Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution

Chen, Yunpeng; Fan, Haoqi; Xu, Bing; Yan, Zhicheng; Kalantidis, Yannis; Rohrbach, Marcus; Yan, Shuicheng; Feng, Jiashi

doi:10.1109/iccv.2019.00353

Cited by 473 publications

(302 citation statements)

References 42 publications

Supporting

Mentioning

299

Contrasting


“…Recently, there are some concurrent works aiming at improving the performance by utilizing the multi-scale features [5], [9], [11], [49]. Big-Little Net [5] is a multi-branch network composed of branches with different computational complexity.…”

Section: Concurrent Work (mentioning)

confidence: 99%

“…Big-Little Net [5] is a multi-branch network composed of branches with different computational complexity. Octave Conv [9] decomposes the standard convolution into two resolutions to process features at different frequencies. MSNet [11] utilizes a high-resolution network to learn high-frequency residuals by using the up-sampled low-resolution features learned by a low-resolution network.…”

Section: Concurrent Work (mentioning)

confidence: 99%

“…Unlike most existing methods that enhance the layer-wise multi-scale representation strength of CNNs, we improve the multi-scale representation ability at a more granular level. Different from some concurrent works [5], [9], [11] that improve the multi-scale ability by utilizing features with different resolutions, the multi-scale of our proposed method refers to the multiple available receptive fields at a more granular level. To achieve this goal, we replace the 3 × 3 filters of n channels, with a set of smaller filter groups, each with w channels (without loss of generality we use n = s × w).…”

Section: Introduction (mentioning)

confidence: 99%
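
The channel split quoted above (n = s × w) can be illustrated with a short Python sketch. The helper names `split_channels` and `conv3x3_weights` are hypothetical, and the weight count assumes that every group gets its own bias-free 3 × 3 filter mapping w channels to w channels. That is an assumption made for illustration, not a description of the exact Res2Net block.

```python
def split_channels(n, s):
    """Split channel indices 0..n-1 into s contiguous groups of w = n // s channels."""
    if n % s != 0:
        raise ValueError("n must be divisible by s so that n = s * w")
    w = n // s
    return [list(range(i * w, (i + 1) * w)) for i in range(s)]


def conv3x3_weights(in_channels, out_channels):
    """Weight count of a bias-free 3 x 3 convolution."""
    return 9 * in_channels * out_channels


n, s = 64, 4
groups = split_channels(n, s)
w = len(groups[0])

# One 3 x 3 filter bank over all n channels.
single = conv3x3_weights(n, n)
# Assumed alternative: s independent 3 x 3 filter groups, each w -> w channels.
grouped = s * conv3x3_weights(w, w)

print(w, single, grouped)  # 16 36864 9216
```

Under these assumptions the grouped variant uses 1/s of the weights of the single filter bank, since s · 9w² = 9n²/s.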


Res2Net: A New Multi-Scale Backbone Architecture

Gao, Cheng, Zhao et al. 2021

IEEE Trans. Pattern Anal. Mach. Intell.

1,823

847


Representing features at multiple scales is of great importance for numerous vision tasks. Recent advances in backbone convolutional neural networks (CNNs) continually demonstrate stronger multi-scale representation ability, leading to consistent performance gains on a wide range of applications. However, most existing methods represent the multi-scale features in a layer-wise manner. In this paper, we propose a novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within one single residual block. The Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. The proposed Res2Net block can be plugged into the state-of-the-art backbone CNN models, e.g., ResNet, ResNeXt, and DLA. We evaluate the Res2Net block on all these models and demonstrate consistent performance gains over baseline models on widely-used datasets, e.g., CIFAR-100 and ImageNet. Further ablation studies and experimental results on representative computer vision tasks, i.e., object detection, class activation mapping, and salient object detection, further verify the superiority of the Res2Net over the state-of-the-art baseline methods. The source code and trained models are available on https://mmcheng.net/res2net/.


“…An octave convolutional layer [1] factorizes the output feature maps of a convolutional layer into two groups. The resolution of the low-frequency feature maps is reduced by an octave: height and width dimensions are divided by 2.…”

Section: Multi-scale Octave Convolutions (mentioning)

confidence: 99%
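
The factorization quoted above can be made concrete with a small shape calculation. This is a minimal sketch: the function names and the parameter `alpha` (the fraction of channels placed in the low-frequency group) are assumptions introduced here for illustration and do not come from the quoted statement.

```python
def octave_shapes(channels, height, width, alpha=0.5):
    """Return ((C_high, H, W), (C_low, H // 2, W // 2)) for a two-group split.

    alpha is the assumed fraction of channels assigned to the low-frequency
    group, whose height and width are divided by 2 (one octave).
    """
    low_channels = int(alpha * channels)
    high_channels = channels - low_channels
    high = (high_channels, height, width)
    low = (low_channels, height // 2, width // 2)
    return high, low


def storage_ratio(channels, height, width, alpha=0.5):
    """Feature-map elements after the split, relative to the unsplit tensor."""
    high, low = octave_shapes(channels, height, width, alpha)
    split = high[0] * high[1] * high[2] + low[0] * low[1] * low[2]
    return split / (channels * height * width)


print(octave_shapes(64, 32, 32))   # ((32, 32, 32), (32, 16, 16))
print(storage_ratio(64, 32, 32))   # 0.625
```

With half of the channels in the low-frequency group, the split stores 62.5% of the elements of the original tensor, because each low-frequency map holds a quarter of the spatial positions.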


Multi-Scale Octave Convolutions for Robust Speech Recognition

Rownicka, Bell, Renals 2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


We propose a multi-scale octave convolution layer to learn robust speech representations efficiently. Octave convolutions were introduced by Chen et al. [1] in the computer vision field to reduce the spatial redundancy of the feature maps by decomposing the output of a convolutional layer into feature maps at two different spatial resolutions, one octave apart. This approach improved the efficiency as well as the accuracy of the CNN models. The accuracy gain was attributed to the enlargement of the receptive field in the original input space. We argue that octave convolutions likewise improve the robustness of learned representations due to the use of average pooling in the lower resolution group, acting as a low-pass filter. We test this hypothesis by evaluating on two noisy speech corpora: Aurora-4 and AMI. We extend the octave convolution concept to multiple resolution groups and multiple octaves. To evaluate the robustness of the inferred representations, we report the similarity between clean and noisy encodings using an affine projection loss as a proxy robustness measure. The results show that the proposed method reduces the WER by up to 6.6% relative for Aurora-4 and 3.6% for AMI, while improving the computational efficiency of the CNN acoustic models.
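
The claim that average pooling acts as a low-pass filter can be checked on a toy example. The sketch below is an illustration only, assuming a 2 × 2 window with stride 2 and an input with even height and width; the function name `avg_pool_2x2` is hypothetical and is not taken from the paper.

```python
def avg_pool_2x2(image):
    """2 x 2 average pooling with stride 2 on a list-of-lists image.

    Assumes even height and width; each output value is the mean of one
    non-overlapping 2 x 2 block, so height and width are both halved.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


# Highest-frequency pattern on a 4 x 4 grid: a 0/1 checkerboard.
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
# Lowest-frequency pattern: a constant image.
constant = [[3.0] * 4 for _ in range(4)]

print(avg_pool_2x2(checkerboard))  # [[0.5, 0.5], [0.5, 0.5]]
print(avg_pool_2x2(constant))      # [[3.0, 3.0], [3.0, 3.0]]
```

The checkerboard, which alternates at every pixel, is flattened to its mean, while the constant image passes through unchanged. That is the behaviour expected of a low-pass filter.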


OVS‐Net: An effective feature extraction network for optical coherence tomography angiography vessel segmentation

Zhu, Wang, Xiao et al. 2022

Computer Animation & Virtual


Optical coherence tomography angiography (OCTA), as a noninvasive imaging modality, has been widely used in clinical ophthalmology. However, the segmentation of retinal vessels in OCTA is under-studied because OCTA is a relatively new technology. In this article, an effective feature extraction network, OVS-Net, is proposed for OCTA vessel segmentation. The OVS-Net is divided into a coarse stage and a refinement stage, whose structures are basically the same. In each stage, we utilize OctaveResBlock as the basic block to better extract the hierarchical multifrequency features of OCTA and capture the multiscale semantic features of the vessels. To improve the feature characterization, a feature enhanced attention block is introduced into the network, which proved to be more conducive to microvessel segmentation in our experiments. Multiscale feature blocks are introduced into the network to promote the deep integration of semantic features at different scales. Experiments on the OCTA-SS and OCTA-500 datasets show that our proposed OVS-Net achieves more competitive segmentation results than the existing methods, especially for microvessel segmentation.
