2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00186

Searching for a Robust Neural Architecture in Four GPU Hours

Abstract: Conventional neural architecture search (NAS) approaches are based on reinforcement learning or evolutionary strategies, which take more than 3,000 GPU hours to find a good model on CIFAR-10. We propose an efficient NAS approach that learns to search by gradient descent. Our approach represents the search space as a directed acyclic graph (DAG). This DAG contains billions of sub-graphs, each of which indicates a kind of neural architecture. To avoid traversing all the possibilities of the sub-graphs, we develop a di…
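To make the scale of this DAG search space concrete, here is a minimal Python sketch that counts the candidate sub-graphs of a DARTS/GDAS-style cell. The node and operation counts are illustrative assumptions, not figures taken from the paper.

```python
# Illustrative count of candidate sub-graphs in a DARTS/GDAS-style cell.
# Assumptions (not the paper's exact configuration): 4 intermediate nodes,
# node i receiving edges from the 2 cell inputs and all previous
# intermediate nodes, and 8 candidate operations per edge.
NUM_OPS = 8
NUM_INTERMEDIATE_NODES = 4

num_edges = sum(i + 2 for i in range(NUM_INTERMEDIATE_NODES))  # 2+3+4+5 = 14
num_subgraphs = NUM_OPS ** num_edges  # one operation choice per edge

print(f"{num_edges} edges, {num_subgraphs:.2e} candidate sub-graphs")
```

Even under these modest assumptions the cell alone admits on the order of 10^12 sub-graphs, which is why exhaustive traversal is infeasible and a learned sampler is needed.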

Cited by 562 publications (551 citation statements); references 23 publications.

Selected citation statements:
“…The deeper and wider architectures of deep CNNs bring about superior performance on computer vision tasks [6,26,45]. However, they also incur prohibitively expensive computational cost and make model deployment on mobile devices hard, if not impossible.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In order to back-propagate the gradient through the discrete sampling step, we propose using the Gumbel-Max trick [39,40] to re-formulate Equation (1), which makes it possible to sample from a discrete probability distribution in an efficient way, as can be seen in (5) and (6). This method was first applied to NAS in GDAS [41]. DARTS needs to keep all intermediate results in memory, but the Gumbel-Max trick selects only one operation at a time.…”
Section: Methods (citation type: mentioning)
confidence: 99%
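As a rough illustration of the trick described in this statement, the PyTorch sketch below draws a hard one-hot operation choice per edge while letting gradients flow through the softmax relaxation (a straight-through estimator). Shapes, temperature, and variable names are assumptions for illustration; PyTorch's built-in `F.gumbel_softmax(logits, tau, hard=True)` realizes the same idea.

```python
import torch

def gumbel_max_sample(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Sample a hard one-hot choice per row of `logits` via Gumbel-Max,
    with gradients flowing through the softmax relaxation."""
    gumbels = -torch.empty_like(logits).exponential_().log()  # Gumbel(0, 1) noise
    soft = ((logits + gumbels) / tau).softmax(dim=-1)
    index = soft.argmax(dim=-1, keepdim=True)
    hard = torch.zeros_like(soft).scatter_(-1, index, 1.0)
    # Straight-through estimator: the forward pass sees `hard`,
    # the backward pass uses the gradient of `soft`.
    return hard + soft - soft.detach()

# Hypothetical architecture parameters: 14 edges x 8 candidate ops.
arch_params = torch.randn(14, 8, requires_grad=True)
one_hot = gumbel_max_sample(arch_params)
print(one_hot.argmax(dim=-1))  # sampled operation index for each edge
# Only the selected op per edge needs evaluating in the forward pass, so
# memory does not grow with the number of candidates, unlike DARTS.
```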
“…The dense units are also fixed at 512, 1024, 2048 and 4096 for the MNIST, CIFAR10, CIFAR100 and Tiny-ImageNet experiments. However, these hyperparameters may also be encoded in the search space and then searched using Binary CSA, as demonstrated in [37]. Furthermore, ablation experiments are performed to study the impact of the tournament selection method over random selection and of our proposed dynamic flight length distribution (Eq.…”
Section: Methods (citation type: mentioning)
confidence: 99%
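For readers unfamiliar with the selection operators this ablation compares, here is a minimal sketch of tournament selection versus random selection. The function names and the fitness callable are hypothetical, not code from [37].

```python
import random

def tournament_select(population, fitness, k=3):
    # Pick k individuals uniformly at random and keep the fittest one;
    # larger k increases selection pressure.
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)

def random_select(population):
    # Baseline in the ablation: no selection pressure at all.
    return random.choice(population)

# Toy usage on binary-encoded architectures, with bit count as fitness.
pop = [[random.randint(0, 1) for _ in range(10)] for _ in range(20)]
winner = tournament_select(pop, fitness=sum, k=3)
```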
“…This method is also simpler than RL-based methods, as it does not involve a controller. GDAS [37] proposes a differentiable architecture sampler and applies it to directed acyclic graphs (DAGs).…”
Section: Differential Evolution Based Neural Architecture Search (citation type: mentioning)
confidence: 99%
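Connecting the sampler back to the DAG view, the sketch below shows a single hypothetical DAG edge under such a sampler: all candidate operations live on the edge, but only the sampled one executes in the forward pass. The class and its interface are illustrative, not GDAS's actual implementation.

```python
import torch.nn as nn

class SampledEdge(nn.Module):
    """One DAG edge whose candidate operation is chosen by a sampler."""
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)

    def forward(self, x, one_hot):
        idx = int(one_hot.argmax())
        # Execute only the sampled operation; multiplying by the one-hot
        # entry keeps the output differentiable w.r.t. the architecture
        # parameters when `one_hot` comes from a straight-through sampler.
        return one_hot[idx] * self.ops[idx](x)

# Example edge with three candidate ops on 16-channel feature maps.
edge = SampledEdge([nn.Conv2d(16, 16, 3, padding=1),
                    nn.MaxPool2d(3, stride=1, padding=1),
                    nn.Identity()])
```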