Ghouthi Boukli Hacene scite author profile

Embedding Artificial Intelligence onto low-power devices is a challenging task that has been partly overcome with recent advances in machine learning and hardware design. Presently, deep neural networks can be deployed on embedded targets to perform different tasks such as speech recognition, object detection or Human Activity Recognition. However, there is still room for optimization of deep neural networks onto embedded devices. These optimizations mainly address power consumption, memory and real-time constraints, but also an easier deployment at the edge. Moreover, there is still a need for a better understanding of what can be achieved for different use cases. This work focuses on quantization and deployment of deep neural networks onto low-power 32-bit microcontrollers. The quantization methods, relevant in the context of an embedded execution onto a microcontroller, are first outlined. Then, a new framework for end-to-end deep neural networks training, quantization and deployment is presented. This framework, called MicroAI, is designed as an alternative to existing inference engines (TensorFlow Lite for Microcontrollers and STM32Cube.AI). Our framework can indeed be easily adjusted and/or extended for specific use cases. Execution using single precision 32-bit floating-point as well as fixed-point on 8- and 16 bits integers are supported. The proposed quantization method is evaluated with three different datasets (UCI-HAR, Spoken MNIST and GTSRB). Finally, a comparison study between MicroAI and both existing embedded inference engines is provided in terms of memory and power efficiency. On-device evaluation is done using ARM Cortex-M4F-based microcontrollers (Ambiq Apollo3 and STM32L452RE).

show abstract

Deep Geometric Knowledge Distillation with Graphs

Lassance

Bontonou

Hacene

et al. 2020

View full text Add to dashboard Cite

In most cases deep learning architectures are trained disregarding the amount of operations and energy consumption. However, some applications, like embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the ability of the proposed method to efficiently distillate knowledge from the teacher to the student, leading to better accuracy for the same budget as compared to existing RKD alternatives.

show abstract

Introducing Graph Smoothness Loss for Training Deep Learning Architectures

Bontonou¹,

Lassance²,

Hacene³

et al. 2019

View full text Add to dashboard Cite

We introduce a novel loss function for training deep learning architectures to perform classification. It consists in minimizing the smoothness of label signals on similarity graphs built at the output of the architecture. Equivalently, it can be seen as maximizing the distances between the network function images of training inputs from distinct classes. As such, only distances between pairs of examples in distinct classes are taken into account in the process, and the training does not prevent inputs from the same class to be mapped to distant locations in the output domain. We show that this loss leads to similar performance in classification as architectures trained using the classical cross-entropy, while offering interesting degrees of freedom and properties. We also demonstrate the interest of the proposed loss to increase robustness of trained architectures to deviations of the inputs.

show abstract

Training Modern Deep Neural Networks for Memory-Fault Robustness

Hacene

Leduc-Primeau

Soussia

et al. 2019

View full text Add to dashboard Cite

Because deep neural networks (DNNs) rely on a large number of parameters and computations, their implementation in energy-constrained systems is challenging. In this paper, we investigate the solution of reducing the supply voltage of the memories used in the system, which results in bit-cell faults. We explore the robustness of state-of-the-art DNN architectures towards such defects and propose a regularizer meant to mitigate their effects on accuracy. Our experiments clearly demonstrate the interest of operating the system in a faulty regime to save energy without reducing accuracy.

show abstract

Attention Based Pruning for Shift Networks

Hacene¹,

Lassance²,

Gripon³

et al. 2019

Preprint

View full text Add to dashboard Cite

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization

Nikolić¹,

Hacene²,

Bannon³

et al. 2020

Preprint

View full text Add to dashboard Cite

Neural networks have demonstrably achieved stateof-the art accuracy using low-bitlength integer quantization, yielding both execution time and energy benefits on existing hardware designs that support short bitlengths. However, the question of finding the minimum bitlength for a desired accuracy remains open. We introduce a training method for minimizing inference bitlength at any granularity while maintaining accuracy. Furthermore, we propose a regularizer that penalizes large bitlength representations throughout the architecture and show how it can be modified to minimize other quantifiable criteria, such as number of operations or memory footprint. We demonstrate that our method learns thrifty representations while maintaining accuracy. With ImageNet, the method produces an average per layer bitlength of 4.13 and 3.76 bits on AlexNet and ResNet18 respectively, remaining within 2.0% and 0.5% of the baseline TOP-1 accuracy.

show abstract

Budget Restricted Incremental Learning with Pre-Trained Convolutional Neural Networks and Binary Associative Memories

Hacene

Gripon

Farrugia

et al. 2019

J Sign Process Syst

View full text Add to dashboard Cite

Thanks to their ability to absorb large amounts of data, Convolutional Neural Networks (CNNs) have become stateof-the-art in numerous vision challenges, sometimes even on par with biological vision. They rely on optimisation routines that typically require intensive computational power, thus the question of embedded architectures is a very active field of research. Of particular interest is the problem of incremental learning, where the device adapts to new observations or classes. To tackle this challenging problem, we propose to combine pre-trained CNNs with binary associative memories, using product random sampling as an intermediate between the two methods. The obtained architecture requires significantly less computational power and memory usage than existing counterparts. Moreover, using various challenging vision datasets we show that the proposed architecture is able to perform one-shot learning -and even use only a small portion of the dataset -while keeping very good accuracy.

show abstract

Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks

Hacene¹,

Gripon²,

Arzel³

et al. 2018

Preprint

View full text Add to dashboard Cite

12 3 4

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.