ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9053986
Deep Geometric Knowledge Distillation with Graphs

Abstract: In most cases, deep learning architectures are trained without regard for the number of operations and the energy consumption involved. However, some applications, such as embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space d…
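The constraint the abstract points to, that direct representation mimicking needs matching latent dimensions (or an extra adapter), can be made concrete with a small sketch. The tensor names, batch size, and widths below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical intermediate representations; sizes are assumptions for illustration.
teacher_feat = torch.randn(32, 512)   # teacher latent space (512-d)
student_feat = torch.randn(32, 128)   # student latent space (128-d)

# Direct mimicking: an MSE between representations only makes sense if both
# networks share the same latent dimension, so a mismatch forces an extra
# learned projection (adapter) on the student side.
proj = torch.nn.Linear(128, 512)
mimic_loss = F.mse_loss(proj(student_feat), teacher_feat)
```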

Cited by 30 publications (36 citation statements)
References 23 publications
“…As previously mentioned, in this work, we are interested in using graphs to ensure that latent spaces of DL architectures have some desirable properties. The various approaches we introduce in this paper are based on our previous contributions [8,10,14]. However, in this paper, they are presented for the first time using a unified methodology and formalism.…”
Section: Related Work (mentioning)
confidence: 99%
“…As a matter of fact, using these hand-crafted features as intermediate representations can cause sub-optimal solutions [5]. On the other hand, completely removing all constraints on the intermediate representations can cause the learning procedure to exhibit unwanted behavior, such as susceptibility to deviations of the inputs [6][7][8], or redundant features [9,10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Distilling a model into itself, or self-distillation, has also proven to be effective when iterated [23]. While individual knowledge distillation focused on the student mimicking the outputs of the teacher, relational knowledge distillation [24,25] made it reproduce the same relations and distances between training examples, yielding a better representation of the latent space for the student and better generalization capabilities.…”
Section: Distillation (mentioning)
confidence: 99%
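A minimal sketch of the distance-based variant of relational distillation described in the statement above, assuming mean-normalized pairwise Euclidean distances and a batch of features from each network (function and tensor names are illustrative, not from the cited works):

```python
import torch
import torch.nn.functional as F

def pairwise_distances(feats: torch.Tensor) -> torch.Tensor:
    """Euclidean distances between all pairs in the batch, scaled by their mean."""
    d = torch.cdist(feats, feats, p=2)
    return d / (d[d > 0].mean() + 1e-8)

def relational_kd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    # Only the relations between examples are matched, so the student's and
    # teacher's latent spaces may have different dimensions.
    return F.smooth_l1_loss(pairwise_distances(student_feat),
                            pairwise_distances(teacher_feat))

loss = relational_kd_loss(torch.randn(32, 128), torch.randn(32, 512))
```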
“…Park et al (2019) designed a relational potential function that facilitated transferring the mutual relations of teacher's output to the student. With a similar notion, Lassance et al (2020) built graphs for both the student and teacher. Latent representation geometry was then transferred by measuring the discrepancy between corresponding adjacency matrices.…”
Section: Knowledge Distillation (mentioning)
confidence: 99%
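As a rough illustration of the graph-based formulation summarized above: build a similarity graph over a batch for each network and penalize the discrepancy between the two adjacency matrices. This is a sketch under assumptions (cosine similarity, an MSE discrepancy), not the authors' exact construction.

```python
import torch
import torch.nn.functional as F

def adjacency(feats: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity graph over the batch (one node per example)."""
    z = F.normalize(feats, dim=1)
    return z @ z.t()

def geometric_kd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    # Penalize the discrepancy between the two adjacency matrices so that the
    # student reproduces the geometry of the teacher's latent space.
    return F.mse_loss(adjacency(student_feat), adjacency(teacher_feat))

loss = geometric_kd_loss(torch.randn(32, 128), torch.randn(32, 512))
```

Because only the batch-level graphs are compared, the student and teacher latent dimensions are free to differ, which is the property the abstract contrasts with direct representation mimicking.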