2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.01325
DN-DETR: Accelerate DETR Training by Introducing Query DeNoising

Cited by 243 publications (148 citation statements) · References 9 publications
“…Decoder. We adopt the transformer encoder-decoder framework as the decoder, which has shown promising detection results in DETR [2], Conditional DETR [17], DAB-DETR [14], Deformable DETR [30], DN-DETR [10], and DINO [28]. Group DETR [3] further improves the training convergence speed and the detection performance of various DETR variants.…”
Section: Architecture
confidence: 99%
“…Unlike previous methods, our proposed CFT does not estimate depth and completely removes camera parameters. Inspired by advanced vision transformers [29,24,20], CFT decouples the positional and content embeddings in the position-aware enhancement and further mines richer 3D information, thereby effectively learning stable BEV representations. Instead of point-wise attention with camera guidance or redundant global attention, a view attention is presented to reduce the computational cost and accelerate the establishment of transformation relations.…”
Section: Related Work
confidence: 99%
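The statement above refers to decoupling positional and content embeddings in attention, a design used in decoders such as Conditional DETR and DAB-DETR, which are cited throughout this record. Below is a minimal sketch of what such decoupling can look like in a single attention head; the function name, shapes, and single-head formulation are illustrative assumptions, not the CFT implementation.

```python
# Hedged sketch: decoupled content/positional cross-attention.
# Concatenating the content and positional parts of queries and keys makes each
# attention logit the sum of a content-content term and a position-position term.
import torch

def decoupled_cross_attention(q_content, q_pos, k_content, k_pos, v):
    # q_content, q_pos: (num_queries, d); k_content, k_pos, v: (num_keys, d)
    q = torch.cat([q_content, q_pos], dim=-1)      # (num_queries, 2d)
    k = torch.cat([k_content, k_pos], dim=-1)      # (num_keys, 2d)
    logits = q @ k.t() / (q.shape[-1] ** 0.5)      # (num_queries, num_keys)
    attn = logits.softmax(dim=-1)
    return attn @ v                                # (num_queries, d)

# Toy usage with random embeddings.
d, nq, nk = 32, 5, 10
out = decoupled_cross_attention(torch.randn(nq, d), torch.randn(nq, d),
                                torch.randn(nk, d), torch.randn(nk, d),
                                torch.randn(nk, d))
print(out.shape)  # torch.Size([5, 32])
```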
“…As a new paradigm for object detection, the detection transformer (DETR) [13] eliminates the need for hand-designed components and shows promising performance compared with most classical detectors based on convolutional architectures, owing to the global information captured by self-attention [14]. In the ensuing years, many improved DETR-like methods [15][16][17] have been proposed to address the slow training convergence of DETR and the unclear meaning of its queries. Among them, DETR with improved denoising anchor boxes (DINO) [18] became a new state-of-the-art approach on COCO 2017 [19], proving that transformer-based object-detection models can also achieve superior performance.…”
Section: Introduction
confidence: 99%
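The statement above points to the denoising idea introduced by DN-DETR and extended by DINO: perturbed ground-truth boxes are fed as extra decoder queries whose target is the original box, giving the decoder an easier auxiliary reconstruction task that speeds convergence. Below is a minimal sketch of the box-noising step only; the function name, noise scale, and box format are illustrative assumptions rather than the paper's exact implementation.

```python
# Hedged sketch of generating noised ground-truth boxes as denoising queries.
import torch

def make_denoising_queries(gt_boxes, box_noise_scale=0.4):
    # gt_boxes: (num_gt, 4) in normalized (cx, cy, w, h) format.
    noise = (torch.rand_like(gt_boxes) * 2 - 1) * box_noise_scale
    noisy = gt_boxes.clone()
    noisy[:, :2] += noise[:, :2] * gt_boxes[:, 2:]   # jitter centers by a fraction of w, h
    noisy[:, 2:] *= (1 + noise[:, 2:])               # jitter widths and heights
    return noisy.clamp(0.0, 1.0)                     # keep boxes inside the image

gt = torch.tensor([[0.5, 0.5, 0.2, 0.3]])
noisy_queries = make_denoising_queries(gt)
# A denoising loss would regress `noisy_queries` back to `gt`, alongside the
# usual Hungarian-matched loss on the learnable object queries.
```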
“…Compared with the well-established CNN-based detectors, efficient domain-adaptation methods for enhancing the cross-domain performance of DETR-like detectors remain rarely explored. The design draws on DN-DETR [17], DAB-DETR [16], and Deformable DETR [15], with DINO achieving exceptional results on public datasets. However, as with previous object detectors, it cannot be directly applied to new scenarios when environmental conditions change, which results in significant performance degradation.…”
Section: Introduction
confidence: 99%