2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.00919
Localization Distillation for Dense Object Detection

Cited by 98 publications (24 citation statements) · References 45 publications
“…In early versions of YOLOv6, self-distillation is introduced only in large models (i.e., YOLOv6-M/L), applying the vanilla knowledge distillation technique of minimizing the KL-divergence between the class predictions of the teacher and the student. Meanwhile, DFL [8] is adopted as the regression loss to perform self-distillation on box regression, similar to [19].…”
Section: Self-distillation (mentioning, confidence: 99%)
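As a rough illustration of the vanilla knowledge-distillation step described in the statement above, the sketch below minimizes a temperature-scaled KL divergence between teacher and student class predictions. The function name, tensor shapes, and temperature are illustrative assumptions, not YOLOv6's actual implementation.

```python
# Minimal sketch of vanilla class-level knowledge distillation: the student's
# class logits are pushed toward the teacher's softened predictions by
# minimizing a temperature-scaled KL divergence. Names are illustrative.
import torch
import torch.nn.functional as F

def class_kd_loss(student_logits: torch.Tensor,
                  teacher_logits: torch.Tensor,
                  temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) over class predictions, averaged over samples."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # "batchmean" matches the mathematical definition of KL divergence;
    # the t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)
```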
“…Li et al. [24] used region proposals from the larger network to help the smaller network learn higher-level semantic information. Zheng et al. [25] transferred knowledge distillation from the classification head to the localization head of object detection, leading to a new distillation mechanism termed Localization Distillation (LD). LD makes logit mimicking a better alternative to feature imitation and reveals that the knowledge of object category and object location should be handled separately.…”
Section: B. Knowledge Distillation (mentioning, confidence: 99%)
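For readers unfamiliar with LD, the following sketch shows one way the idea can be expressed once each box edge is predicted as a discrete distribution over bins (as in GFL/DFL-style heads): the student's edge distributions are distilled toward the teacher's with a KL divergence. Tensor shapes, the temperature, and the function name are assumptions for illustration, not the authors' code.

```python
# Minimal sketch of response-level localization distillation (LD), assuming
# the detection head predicts each box edge (l, r, t, b) as a discrete
# distribution over `n_bins` positions. Shapes and names are assumptions.
import torch
import torch.nn.functional as F

def localization_distillation_loss(student_reg: torch.Tensor,  # (N, 4, n_bins) logits
                                   teacher_reg: torch.Tensor,  # (N, 4, n_bins) logits
                                   temperature: float = 10.0) -> torch.Tensor:
    """KL divergence between teacher and student edge distributions."""
    t = temperature
    log_p_student = F.log_softmax(student_reg / t, dim=-1)
    p_teacher = F.softmax(teacher_reg / t, dim=-1)
    # Sum the KL terms over the bins of each edge, then average over edges and boxes.
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=-1)
    return kl.mean() * (t * t)
```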
“…Recently, several works (Dai et al. 2021; Yang et al. 2021; Chen et al. 2021; Zhang and Ma 2020) achieve feature-based distillation by focusing on the foreground area or applying a weight matrix to the features. LD (Zheng et al. 2022) tackles the difficult problem of localization distillation at the response level by converting bounding-box regression to a probability distribution representation. Besides, cross-modal feature distillation approaches (Chong et al. 2022; Guo et al. 2021) are gaining popularity as a way to exploit the complementarity between different modalities.…”
Section: Related Work (mentioning, confidence: 99%)
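A minimal sketch of the foreground-focused feature imitation mentioned above might look like the following, assuming a binary foreground mask (e.g. derived from ground-truth boxes) and a 1x1 adapter to align channel widths; both are common choices but are assumptions here rather than any specific cited method.

```python
# Minimal sketch of feature-based distillation restricted to foreground regions:
# the student feature map imitates the teacher's only where a binary mask is set.
import torch
import torch.nn as nn

class MaskedFeatureImitation(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # 1x1 conv adapts the student's channel width to the teacher's.
        self.adapter = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self,
                student_feat: torch.Tensor,   # (B, Cs, H, W)
                teacher_feat: torch.Tensor,   # (B, Ct, H, W)
                fg_mask: torch.Tensor) -> torch.Tensor:  # (B, 1, H, W), 1 = foreground
        diff = (self.adapter(student_feat) - teacher_feat) ** 2
        # Sum the squared error over channels at foreground locations,
        # normalized by the number of foreground locations.
        return (diff * fg_mask).sum() / fg_mask.sum().clamp(min=1.0)
```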
“…current KD methods for object detection can be broadly classified into feature-based and response-based streams: the former carries out distillation at the feature level (Zagoruyko and Komodakis 2017; Romero et al. 2014; Huang and Wang 2017; Heo et al. 2019; Ye et al. 2020; Du et al. 2020) to enforce consistency of feature representations between the teacher-student pair, whereas the latter adopts the teacher's confident predictions as soft targets in addition to the hard ground-truth supervision (Yuan et al. 2020; Zheng et al. 2022; Dai et al. 2021). However, directly migrating existing KD methods to LiDAR-to-stereo cross-modal distillation is less effective due to the huge gap between the two extremely different modalities.…”
Section: Introduction (mentioning, confidence: 99%)
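To make the two streams concrete, a typical training objective simply adds weighted feature-level and response-level distillation terms on top of the hard ground-truth loss; the sketch below is only a schematic combination, and the weights are placeholders rather than values from any cited work.

```python
# Schematic sketch of a combined detector training objective: hard ground-truth
# supervision plus weighted response-level and feature-level distillation terms.
import torch

def total_training_loss(gt_loss: torch.Tensor,
                        response_kd_loss: torch.Tensor,
                        feature_kd_loss: torch.Tensor,
                        w_response: float = 1.0,
                        w_feature: float = 1.0) -> torch.Tensor:
    # Each term would be produced by losses like the sketches above.
    return gt_loss + w_response * response_kd_loss + w_feature * feature_kd_loss
```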