2023
DOI: 10.1109/tnnls.2021.3105247

A Facial Landmark Detection Method Based on Deep Knowledge Transfer

Cited by 7 publications (3 citation statements)
References 38 publications
“…Guo et al. [27] trained a lightweight network consisting of MobileNetV2 [36] blocks by using an auxiliary 3D pose estimator. To utilize the learning ability of large models, some recent works [28, 29, 30, 31] used the teacher-guided KD technique to make a small student network learn the dark knowledge from a large teacher network. The student networks were usually based on existing lightweight networks (e.g., MobileNetV2, EfficientNet-B0 [37], and HRNetV2-W9 [34]), while the teacher networks used large CNN models (e.g., ResNet-50, EfficientNet-B7 [37], and HRNetV2-W18 [34]) as the network backbone.…”
Section: Related Work
confidence: 99%
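
The teacher-guided KD scheme described in the excerpt above can be captured in a few lines. Below is a minimal sketch in PyTorch, assuming heatmap-based detectors; the model objects, the MSE losses, and the weighting `alpha` are illustrative assumptions, not the exact training setup of the cited works.

```python
# Minimal teacher-guided knowledge distillation step for landmark detection.
# Assumptions (not from the cited papers): heatmap-based outputs, MSE for
# both the supervised and distillation terms, and a fixed weight alpha.
import torch
import torch.nn as nn

mse = nn.MSELoss()

def kd_step(student, teacher, images, gt_heatmaps, alpha=0.5):
    """Return the combined loss: fit the ground truth and mimic the teacher."""
    teacher.eval()
    with torch.no_grad():
        t_heatmaps = teacher(images)        # "dark knowledge": soft teacher output
    s_heatmaps = student(images)
    loss_gt = mse(s_heatmaps, gt_heatmaps)  # supervised landmark loss
    loss_kd = mse(s_heatmaps, t_heatmaps)   # distillation loss against the teacher
    return (1 - alpha) * loss_gt + alpha * loss_kd
```

In this setup the large teacher (e.g., a ResNet-50 or EfficientNet-B7 backbone) is trained first and frozen, and only the lightweight student is updated.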
“…Recently, some researchers have tended to balance the accuracy and efficiency of a facial landmark detector. They either train a small model from scratch [26, 27] or use knowledge distillation (KD) for model compression [28, 29, 30, 31]. The former aims to design a lightweight network combined with an effective learning strategy, while the latter considers how to apply the KD technique to transfer the dark knowledge from a large network to a small one.…”
Section: Introduction
confidence: 99%
“…Hannane et al. [39] learned an FLM topological model that performs a divide-and-conquer search over different patches of the face using coarse-to-fine CNN techniques and subsequently refines the landmark positions with a shallow cascaded CNN regression. Gao developed a supervised encoder-decoder architecture [40] based on EfficientNet-B0, where the dark knowledge extracted from the teacher network is used to supervise the training of a small student network and patch similarity (PS) distillation is used to learn the structural information of the face.…”
Section: Coarse-to-Fine Techniques
confidence: 99%
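
The patch similarity (PS) distillation mentioned in the excerpt can be sketched as matching pairwise patch-similarity matrices between teacher and student feature maps. The patch size, cosine similarity, and MSE matching below are assumptions for illustration; the cited paper's exact PS formulation may differ.

```python
# Hedged sketch of patch-similarity (PS) distillation: both networks' feature
# maps are split into non-overlapping patches, pairwise cosine similarities
# between patches are computed, and the student is trained to match the
# teacher's similarity structure (the "structural information" of the face).
import torch
import torch.nn.functional as F

def patch_similarity(feat, patch=4):
    """B x C x H x W feature map -> B x P x P patch-similarity matrix
    (H and W are assumed divisible by `patch`)."""
    vecs = F.unfold(feat, kernel_size=patch, stride=patch)  # B x (C*p*p) x P
    vecs = F.normalize(vecs.transpose(1, 2), dim=2)         # B x P x (C*p*p)
    return vecs @ vecs.transpose(1, 2)                      # B x P x P

def ps_loss(student_feat, teacher_feat, patch=4):
    """L2 distance between student and teacher patch-similarity matrices."""
    return F.mse_loss(patch_similarity(student_feat, patch),
                      patch_similarity(teacher_feat, patch))
```

Matching similarity matrices rather than raw features lets the student learn how facial regions relate to one another without having to reproduce the teacher's feature dimensionality.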