2018
DOI: 10.1007/978-3-030-01249-6_18

NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications

Abstract: This work proposes an algorithm, called NetAdapt, that automatically adapts a pre-trained deep neural network to a mobile platform given a resource budget. While many existing algorithms simplify networks based on the number of MACs or weights, optimizing those indirect metrics may not necessarily reduce the direct metrics, such as latency and energy consumption. To solve this problem, NetAdapt incorporates direct metrics into its adaptation algorithm. These direct metrics are evaluated using empirical measurements…
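To make the abstract's loop concrete, here is a self-contained toy sketch of a NetAdapt-style adaptation: iteratively tighten the resource budget, try simplifying each layer to meet it, and keep the most accurate candidate. The network representation, latency model, and accuracy model below are illustrative stand-ins; the actual algorithm uses empirical on-device measurements and short-term fine-tuning at each step.

```python
# Toy NetAdapt-style loop: the network is a list of layer widths, and the
# latency/accuracy functions are stand-ins for on-device measurement and
# short-term fine-tuned accuracy in the real algorithm.

def measure_latency(widths):
    # Stand-in for a direct, empirical measurement (e.g., on-device timing).
    return sum(w * 0.1 for w in widths)

def evaluate_accuracy(widths):
    # Stand-in for accuracy after short-term fine-tuning: wider layers help,
    # with diminishing returns.
    return sum(w ** 0.5 for w in widths)

def netadapt(widths, final_budget, step):
    budget = measure_latency(widths)
    while budget > final_budget:
        budget = max(budget - step, final_budget)  # tighten the constraint
        proposals = []
        for i in range(len(widths)):
            candidate = list(widths)
            # Shrink layer i until the whole network meets the current budget.
            while candidate[i] > 1 and measure_latency(candidate) > budget:
                candidate[i] -= 1
            if measure_latency(candidate) <= budget:
                proposals.append((evaluate_accuracy(candidate), candidate))
        if not proposals:
            break
        _, widths = max(proposals)  # keep the most accurate candidate
    return widths

print(netadapt([64, 128, 256, 512], final_budget=50.0, step=10.0))
```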

Cited by 418 publications (341 citation statements) · References 24 publications
Citation types: 3 supporting, 323 mentioning, 0 contrasting · Years published: 2018–2022

Citation statements (ordered by relevance):
“…To reduce the computational cost of search, a differentiable architecture search framework is used in [28,5,45] with gradient-based optimization. Focusing on adapting existing networks to constrained mobile platforms, [48,15,12] proposed more efficient automated network simplification algorithms.…”
Section: Related Work (mentioning)
confidence: 99%
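The quote above refers to gradient-based differentiable architecture search. A minimal DARTS-style sketch of the core idea follows: each layer computes a softmax-weighted mixture of candidate operations, so the architecture parameters alpha receive gradients alongside the weights. The three candidate operations are illustrative choices, not ones from the cited papers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # candidate op 1
            nn.Conv2d(channels, channels, 5, padding=2),  # candidate op 2
            nn.Identity(),                                # candidate op 3 (skip)
        ])
        # One architecture parameter per candidate op, learned by gradients.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        # Continuous relaxation of the choice: weighted sum over all ops.
        return sum(w * op(x) for w, op in zip(weights, self.ops))

op = MixedOp(channels=8)
x = torch.randn(1, 8, 16, 16)
op(x).mean().backward()   # gradients flow into op.alpha as well
print(op.alpha.grad)
```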
See 1 more Smart Citation
“…To reduce the computational cost of search, differentiable architecture search framework is used in [28,5,45] with gradient-based optimization. Focusing on adapting existing networks to constrained mobile platforms, [48,15,12] proposed more efficient automated network simplification algorithms.…”
Section: Related Workmentioning
confidence: 99%
“…Network search has shown itself to be a very powerful tool for discovering and optimizing network architectures [53,43,5,48]. For MobileNetV3 we use platform-aware NAS to search for the global network structures by optimizing each network block.…”
Section: Network Search (mentioning)
confidence: 99%
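Platform-aware NAS of the kind this quote alludes to typically scores candidate networks with a latency-aware objective. A toy sketch of the MnasNet-style reward is below; the exponent value comes from MnasNet, not from this page, and the numbers in the usage lines are made up.

```python
def nas_reward(accuracy, latency_ms, target_ms, w=-0.07):
    """Reward = ACC * (LAT / TAR)^w; with w < 0, exceeding the target is penalized."""
    return accuracy * (latency_ms / target_ms) ** w

print(nas_reward(accuracy=0.75, latency_ms=90.0, target_ms=80.0))  # over budget: penalized
print(nas_reward(accuracy=0.75, latency_ms=70.0, target_ms=80.0))  # under budget: rewarded
```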
“…State-of-the-art network compression methods can achieve significant reductions in network size, in some cases by an order of magnitude, but often require specialized software or hardware support. For example, unstructured pruning requires optimized sparse matrix multiplication routines to realize practical acceleration [26], platform constraint-aware compression [2,36,37] requires hardware simulators or empirical measurements, and arbitrary-bit quantization [9,17] requires specialized hardware. One of the advantages of knowledge distillation is that it is easily implemented in any off-the-shelf deep learning framework without the need for extra software or hardware.…”
Section: Related Work (mentioning)
confidence: 99%
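The knowledge distillation the quote contrasts with hardware-dependent compression is indeed a few lines in any framework. A minimal sketch: soften teacher and student logits with a temperature T, match them with KL divergence, and mix in the usual cross-entropy term. The temperature and the 0.5 mixing weight are illustrative choices, not values from the cited work.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                       # rescale by T^2 to keep gradient magnitudes comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 10, requires_grad=True)   # stand-in student logits
teacher = torch.randn(8, 10)                       # stand-in teacher logits
labels = torch.randint(0, 10, (8,))
distillation_loss(student, teacher, labels).backward()
```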
“…Currently, the design of computationally efficient CNNs is moving from manual tuning [17][18][19] towards automatic algorithms [20][21][22][23][24][25][26]. Incorporating specific platform constraints into such approaches involves modeling how the network architecture relates to the optimization target.…”
Section: Related Work (mentioning)
confidence: 99%
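One common way to model how an architecture relates to the optimization target, and the one NetAdapt itself uses for latency, is an empirical lookup table: measure each layer configuration once on the target device, then estimate whole-network latency as the sum of table entries. The table contents below are made-up illustrative numbers.

```python
# (in_channels, out_channels) -> measured latency in ms (hypothetical values)
LAYER_LATENCY_MS = {
    (32, 64): 1.8,
    (64, 64): 3.1,
    (64, 128): 5.9,
}

def estimate_latency(layers):
    """Estimate network latency by summing per-layer measured latencies."""
    return sum(LAYER_LATENCY_MS[cfg] for cfg in layers)

print(estimate_latency([(32, 64), (64, 64), (64, 128)]))  # 10.8 ms
```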