2022
DOI: 10.1007/978-3-031-20077-9_8

Prediction-Guided Distillation for Dense Object Detection

Abstract: Real-world object detection models should be cheap and accurate. Knowledge distillation (KD) can boost the accuracy of a small, cheap detection model by leveraging useful information from a larger teacher model. However, a key challenge is identifying the most informative features produced by the teacher for distillation. In this work, we show that only a very small fraction of features within a ground-truth bounding box are responsible for a teacher's high detection performance. Based on this, we propose Prediction-Guided Distillation (PGD)…
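The abstract's core idea, distilling only from the few informative teacher features inside each ground-truth box, can be illustrated with a masked feature-imitation loss. The sketch below is a simplified assumption of that setup (function names, tensor shapes, and the plain MSE objective are illustrative, not the paper's implementation):

```python
import torch

def masked_feature_distillation_loss(student_feat, teacher_feat, mask):
    """student_feat, teacher_feat: (B, C, H, W); mask: (B, 1, H, W) with values in {0, 1}."""
    # In practice the student usually needs a 1x1 conv adapter to match the
    # teacher's channel width; here we assume the channels already agree.
    diff = (student_feat - teacher_feat) ** 2 * mask  # mask broadcasts over channels
    # Normalise by the number of selected positions so the loss scale does not
    # depend on how sparse the mask is.
    return diff.sum() / (mask.sum() * student_feat.shape[1] + 1e-6)

# Toy usage with random tensors standing in for FPN features.
s = torch.randn(2, 256, 32, 32)
t = torch.randn(2, 256, 32, 32)
m = (torch.rand(2, 1, 32, 32) > 0.9).float()  # sparse "informative" mask
loss = masked_feature_distillation_loss(s, t, m)
```

The interesting part is how the mask is chosen; the citation statements below describe the paper's prediction-guided weighting for exactly that step.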

Cited by 15 publications (4 citation statements)
References 35 publications

“…Traditional object detection entails the task of identifying and localizing all objects within an image or video frame, encompassing the simultaneous duties of classification and spatial localization. Recently, the successful application of knowledge distillation in traditional object detection has garnered attention [2,16,30,33,36]. In the pursuit of compact and efficient object detection networks, [3] have seamlessly integrated knowledge distillation, achieving heightened efficiency with minimal accuracy trade-offs.…”
Section: Knowledge Distillation in Object Detection
confidence: 99%
“…Therefore, we would like to use knowledge of the features of defects in metal fittings and insulators as a focus for the teacher network to guide the student network. To this end, the PGW (Prediction-Guided Weighting) module (Yang et al., 2022) is introduced to refine the foreground distillation region; PGW concentrates distillation precisely on the top-k feature pixels with the highest quality scores within the foreground region.…”
Section: Knowledge Distillation Guided by Key Area Scoring
confidence: 99%
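For context, a minimal sketch of a PGW-style selection is given below: keep only the top-k foreground feature locations with the highest teacher prediction-quality scores and distill only there. The scoring function, names, and the value of k are assumptions for illustration, not the published module:

```python
import torch

def topk_foreground_mask(quality, fg_mask, k=16):
    """quality: (B, H, W) teacher prediction-quality map (hypothetical,
    e.g. classification score x predicted IoU); fg_mask: (B, H, W) binary
    ground-truth-box mask. Returns a (B, 1, H, W) distillation mask."""
    b, h, w = quality.shape
    scores = (quality * fg_mask).reshape(b, -1)   # zero out background scores
    k = min(k, h * w)
    topk_idx = scores.topk(k, dim=1).indices      # top-k positions per image
    mask = torch.zeros_like(scores)
    mask.scatter_(1, topk_idx, 1.0)
    # Multiply by fg_mask again so zero-score background ties are never kept.
    return mask.reshape(b, 1, h, w) * fg_mask.reshape(b, 1, h, w)

# Toy usage on one feature level of a 2-image batch.
q = torch.rand(2, 32, 32)
fg = (torch.rand(2, 32, 32) > 0.7).float()
distill_mask = topk_foreground_mask(q, fg, k=16)
```

A mask produced this way could feed directly into a masked feature-imitation loss like the one sketched after the abstract above.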
“…At present, some research has applied the knowledge distillation method in the field of electric power. The literature (Yang et al., 2022) proposes a compression and integration method based on knowledge distillation: the DETR model is used for initial target detection, and the Deformable DETR algorithm is used to compress the DETR model, reaching a compression ratio of 87.5% while keeping target detection accuracy high, thereby realizing the effective integrated deployment of the detection model on the substation inspection robot.…”
Section: Introduction
confidence: 99%
“…Knowledge Distillation, first introduced by Bucila et al. [11] and popularized by [12], has served as a successful strategy for achieving a better trade-off between performance and efficiency in deep neural networks by using the knowledge of a more complex network (the teacher) to assist the training of a lighter network (the student). Methods based on knowledge distillation have greatly improved the accuracy of lightweight networks on tasks such as image classification [13]-[17], object detection [18]-[20], and face recognition [21]-[23]. The knowledge distilled in the pioneering work of [12] for image classification takes the form of soft labels from a heavy teacher network, which carry more beneficial information (e.g., intra-class similarity and inter-class differences) than the hard, one-hot class-label vectors originally provided to the small network.…”
Section: Introduction
confidence: 99%
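The temperature-softened soft-label distillation described in that statement (the classic Hinton et al. formulation) can be sketched as follows; the temperature, the mixing weight, and the combination with hard-label cross-entropy are illustrative assumptions rather than a specific paper's settings:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-label distillation: KL divergence between temperature-softened
    class distributions, mixed with the ordinary hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes match the unsoftened objective
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with a 10-class head.
s_logits = torch.randn(8, 10)
t_logits = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = kd_loss(s_logits, t_logits, y)
```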