2021
DOI: 10.1109/access.2021.3099856

Teaching Yourself: A Self-Knowledge Distillation Approach to Action Recognition

Abstract: Knowledge distillation, the process of transferring complex knowledge learned by a heavy network (the teacher) to a lightweight network (the student), has emerged as an effective technique for compressing neural networks. To remove the need to train a large teacher network, this paper leverages the recent self-knowledge distillation approach to train a student network progressively by distilling its own knowledge, without a pre-trained teacher network. Unlike the existing self-knowledge …
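To make the progressive self-distillation idea concrete, below is a minimal PyTorch sketch, assuming the student periodically freezes a snapshot of itself and distills from that snapshot's softened predictions, so that no pre-trained teacher is required. The function names, temperature T, and weighting alpha are illustrative assumptions, not the paper's exact formulation.

import copy
import torch
import torch.nn.functional as F

def make_snapshot(student):
    # Freeze a copy of the student to serve as its own teacher.
    snapshot = copy.deepcopy(student).eval()
    for p in snapshot.parameters():
        p.requires_grad_(False)
    return snapshot

def self_distillation_loss(student, snapshot, x, y, T=4.0, alpha=0.5):
    # Cross-entropy on the ground-truth labels plus a KL term that
    # distills the student's own earlier (softened) predictions.
    logits = student(x)
    with torch.no_grad():
        past_logits = snapshot(x)  # the student's own past knowledge
    ce = F.cross_entropy(logits, y)
    kd = F.kl_div(
        F.log_softmax(logits / T, dim=1),
        F.softmax(past_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * ce + alpha * kd

Refreshing the snapshot every few epochs (snapshot = make_snapshot(student)) makes the distillation target progressively stronger as training proceeds.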

Cited by 30 publications (19 citation statements). References 39 publications.
“…A consistency loss, e.g., KL divergence, works as an efficient way to distill the knowledge in a trained black-box model. In addition, a previous study of knowledge distillation in a single domain (Guo et al., 2020; Vu et al., 2021) has also shown that a student model trained with distillation can be more general (Wang et al., 2021). Thus, our framework can be a viable solution for training a target-domain model with decent generalization ability.…”
Section: Discussion (mentioning)
confidence: 98%
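As a hedged illustration of the consistency loss described in this statement, the sketch below (assumed PyTorch; the teacher is treated as a black box whose output probabilities can be queried but whose internals and gradients are unavailable) computes the KL-divergence term pulling the student's predictive distribution toward the teacher's. The temperature T is an illustrative addition.

import torch.nn.functional as F

def consistency_loss(student_logits, teacher_probs, T=1.0):
    # KL divergence between the student's (softened) distribution and
    # probabilities queried from the trained black-box model; no
    # gradients flow into the teacher since only its outputs are used.
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        teacher_probs,
        reduction="batchmean",
    ) * (T * T)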
“…In contrast to the above methods, which leverage multi-scale spatiotemporal information, in [69] a dynamic equilibrium module is inserted into a 3D-CNN backbone to directly suppress the influence of spatiotemporal variations of actions in video. In another line of research, [70] uses a self-knowledge distillation approach to boost the performance of baseline 3D-CNN models (3D ResNet-18 and -50) on the task of action recognition.…”
Section: Top-down Approaches (mentioning)
confidence: 99%
“…Today, Convolutional Neural Network (CNN) models have achieved increasingly remarkable results on computer vision and image processing problems such as image classification [4], [5] and object detection in images [6], [7]. Lightweight CNN models have also received attention, with many variants such as [8]-[11], aimed at allowing models to be deployed on mobile and embedded devices in real time. In this paper, we introduce a modern deep learning based model to solve this problem.…”
Section: Introduction (unclassified)