HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs

Singh, Pravendra; Verma, Vinay Kumar; Rai, Piyush; Namboodiri, Vinay P.

doi:10.1109/cvpr.2019.00497

Cited by 106 publications

(76 citation statements)

References 30 publications

(79 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Pruning methods, such as DSD [15] and ThiNet [31], focus on reducing the redundancy in the model parameters by eliminating the least significant weight or connections in CNNs. Besides, Het-Conv [36] propose to replace the vanilla convolution filters with heterogeneous convolution filters that are in different sizes. However, all of these methods ignore the redundancy on the spatial dimension of feature maps, which is addressed by the proposed OctConv, making OctConv orthogonal and complementary to these previous methods.…”

Section: Related Workmentioning

confidence: 99%

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution

Chen

Fan

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

513

295

View full text Add to dashboard Cite

In natural images, information is conveyed at different frequencies where higher frequencies are usually encoded with fine details and lower frequencies are usually encoded with global structures. Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies. In this work, we propose to factorize the mixed feature maps by their frequencies, and design a novel Octave Convolution (OctConv) operation 1 to store and process feature maps that vary spatially "slower" at a lower spatial resolution reducing both memory and computation cost. Unlike existing multi-scale methods, OctConv is formulated as a single, generic, plug-andplay convolutional unit that can be used as a direct replacement of (vanilla) convolutions without any adjustments in the network architecture. It is also orthogonal and complementary to methods that suggest better topologies or reduce channel-wise redundancy like group or depth-wise convolutions. We experimentally show that by simply replacing convolutions with OctConv, we can consistently boost accuracy for both image and video recognition tasks, while reducing memory and computational cost. An OctConv-equipped ResNet-152 can achieve 82.9% top-1 classification accuracy on ImageNet with merely 22.2 GFLOPs.

show abstract

Section: Related Workmentioning

confidence: 99%

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution

Chen

Fan

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

513

295

View full text Add to dashboard Cite

show abstract

“…Specifically, the heterogeneous architecture comprises heterogeneous convolutions. Here heterogeneous convolutions with P = 2 [45] are used in the IEEB, where P represents part. The heterogeneous convolutions with P = 2 denote that each standard convolution of 3×3 and each convolution of 1×1 are connected.…”

Section: Network Analysismentioning

confidence: 99%

“…The IEEB uses hierarchical LR features and residual learning techniques to enhance the memory ability of shallow layers for improving the SR performance. By incorporating the heterogeneous architecture proposed in [45] into the IEEB, the amounts of parameters and memory consumption for the IEEB are significantly reduced, so as the training time. Then, the RB fuses the extracted global and local features to transform low-frequency features (i.e., the LR features) into high-frequency features (i.e., the HR features) via residual learning and sub-pixel convolution methods.…”

Section: Introductionmentioning

confidence: 99%

Lightweight image super-resolution with enhanced CNN

Tian

Zhuge

et al. 2020

Knowledge-Based Systems

120

View full text Add to dashboard Cite

Deep convolutional neural networks (CNNs) with strong expressive ability have achieved impressive performances on single image super-resolution (SISR). However, their excessive amounts of convolutions and parameters usually consume high computational cost and more memory storage for training a SR model, which limits their applications to SR with resource-constrained devices in real world. To resolve these problems, we propose a lightweight enhanced SR CNN (LESR-CNN) with three successive sub-blocks, an information extraction and enhancement block (IEEB), a reconstruction block (RB) and an information refinement block (IRB). Specifically, the IEEB extracts hierarchical low-resolution (LR) features and aggregates the obtained features step-by-step to increase the memory ability of the shallow layers on deep layers for SISR. To remove redundant information obtained, a heterogeneous architecture is adopted in the IEEB. After that, the RB converts low-frequency features into high-frequency features by fusing global and local features, which is complementary with the IEEB in tackling the long-term dependency problem. Finally, the IRB uses coarse high-frequency features from the RB to learn more accurate SR features and construct a SR image. The proposed LESRCNN can obtain a high-quality image by a model for different scales. Extensive experiments demonstrate that the proposed LESRCNN outperforms state-of-the-arts on SISR in terms of qualitative and quantitative evaluation. The code of LESR-CNN is accessible on https://github.com/hellloxiaotian/LESRCNN.

show abstract

“…The LSTM neural network is a special recurrent neural network (RNN), which introduces a weighted connection with memory and feedback functions. Compared with the feedforward neural network, LSTM can avoid gradient explosion and gradient disappearance, so LSTM can achieve continuous learning for longer time series [42]. The LSTM hidden layer structure is shown in Figure 2.…”

Section: Long Short-term Memorymentioning

confidence: 99%

Ultra-Short-Term Load Demand Forecast Model Framework Based on Deep Learning

Liu

et al. 2020

Energies

View full text Add to dashboard Cite

Ultra-short-term load demand forecasting is significant to the rapid response and real-time dispatching of the power demand side. Considering too many random factors that affect the load, this paper combines convolution, long short-term memory (LSTM), and gated recurrent unit (GRU) algorithms to propose an ultra-short-term load forecasting model based on deep learning. Firstly, more than 100,000 pieces of historical load and meteorological data from Beijing in the three years from 2016 to 2018 were collected, and the meteorological data were divided into 18 types considering the actual meteorological characteristics of Beijing. Secondly, after the standardized processing of the time-series samples, the convolution filter was used to extract the features of the high-order samples to reduce the number of training parameters. On this basis, the LSTM layer and GRU layer were used for modeling based on time series. A dropout layer was introduced after each layer to reduce the risk of overfitting. Finally, load prediction results were output as a dense layer. In the model training process, the mean square error (MSE) was used as the objective optimization function to train the deep learning model and find the optimal super parameter. In addition, based on the average training time, training error, and prediction error, this paper verifies the effectiveness and practicability of the load prediction model proposed under the deep learning structure in this paper by comparing it with four other models including GRU, LSTM, Conv-GRU, and Conv-LSTM.

show abstract

HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs

Cited by 106 publications

References 30 publications

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution

Lightweight image super-resolution with enhanced CNN

Ultra-Short-Term Load Demand Forecast Model Framework Based on Deep Learning

Contact Info

Product

Resources

About