2019
DOI: 10.48550/arxiv.1906.00619
Preprint

Deep Face Recognition Model Compression via Knowledge Transfer and Distillation

Abstract: Fully convolutional networks (FCNs) have become the de facto tool for achieving very high performance on many vision and non-vision tasks in general, and face recognition in particular. Such high accuracies are normally obtained by very deep networks or their ensembles. However, deploying such high-performing models to resource-constrained devices or real-time applications is challenging. In this paper, we present a novel model compression approach based on the student-teacher paradigm for face recognition applications…
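
As a hedged illustration of the student-teacher paradigm described in the abstract, the sketch below shows a generic distillation loss in PyTorch: the student is trained against the teacher's temperature-softened logits in addition to the ground-truth identity labels. The temperature, the weighting factor, and the tensor shapes are assumptions for illustration only, not the paper's exact formulation.

```python
# Generic student-teacher distillation loss (illustrative sketch only).
# The temperature and alpha weighting are placeholder hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-scaled
    # student and teacher distributions (scaled by T^2, as is customary).
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard-target term: ordinary cross-entropy with ground-truth identities.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage with random tensors standing in for a batch of 8 faces and 100 identities.
if __name__ == "__main__":
    student_logits = torch.randn(8, 100)
    teacher_logits = torch.randn(8, 100)
    labels = torch.randint(0, 100, (8,))
    print(distillation_loss(student_logits, teacher_logits, labels))
```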

Cited by 3 publications (3 citation statements) | References 20 publications

“…In addition, employing a loop training strategy to train multiple networks simultaneously and weakening the relationship between instruction and learning can further optimize the teacher-student strategy [18]. To address the problem of low resolution, a teacher-student strategy using the same architecture was proposed in [19], which can be applied to train on images of different resolutions. Also, some improvements have been made to traditional distillation methods for the semantic segmentation task, such as using an association adaptation module to enable the student model to obtain and extract more information when learning the teacher's knowledge [20].…”
Section: Introduction (confidence: 99%)
“…For face recognition knowledge distillation, there have been several attempts (Wang, Lan, and Zhang 2017; Luo et al. 2016; Karlekar, Feng, and Pranata 2019; Ge et al. 2018; Feng et al. 2019; Peng et al. 2019; Wang et al. 2019a, 2020a) in the literature to distil large CNNs, so as to make their deployment easier. Hinton et al. (Hinton, Vinyals, and Dean.…”
Section: Introduction (confidence: 99%)
“…Luo et al. (Luo et al. 2016) propose a neuron selection method by leveraging the essential characteristics (domain knowledge) of the learned face representation. Karlekar et al. (Karlekar, Feng, and Pranata 2019) simultaneously exploit one-hot labels and feature vectors for the knowledge transfer between different face resolutions. Ge et al. (Ge et al. 2018) develop a selective knowledge distillation, which selectively distils the most informative facial features by solving a sparse graph optimization problem.…”
Section: Introduction (confidence: 99%)
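
The one-hot-label-plus-feature-vector transfer attributed to Karlekar, Feng, and Pranata (2019) above can be pictured as a two-term loss. The sketch below is a minimal illustration under assumed names, an assumed cosine feature term, and an assumed loss weight; it is not the authors' exact formulation.

```python
# Illustrative two-term loss: one-hot identity supervision plus feature
# transfer from a high-resolution teacher to a low-resolution student.
# Function name, cosine distance term, and feature_weight are assumptions.
import torch
import torch.nn.functional as F

def label_plus_feature_loss(student_logits, student_emb, teacher_emb,
                            labels, feature_weight=1.0):
    # One-hot label term: standard cross-entropy on the student's logits.
    cls_loss = F.cross_entropy(student_logits, labels)
    # Feature term: cosine distance between the student's embedding of the
    # low-resolution face and the teacher's embedding of the high-resolution face.
    feat_loss = 1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=1).mean()
    return cls_loss + feature_weight * feat_loss

# Usage with random tensors: a batch of 8 faces, 100 identities, 512-d embeddings.
if __name__ == "__main__":
    logits = torch.randn(8, 100)
    s_emb, t_emb = torch.randn(8, 512), torch.randn(8, 512)
    labels = torch.randint(0, 100, (8,))
    print(label_plus_feature_loss(logits, s_emb, t_emb, labels))
```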