2017 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2017.69

Recurrent Scale Approximation for Object Detection in CNN

Abstract: Since convolutional neural networks (CNNs) lack an inherent mechanism to handle large scale variations, feature maps must be computed multiple times for multiscale object detection, which is a computational bottleneck in practice. To address this, we devise a recurrent scale approximation (RSA) that computes the feature map only once and approximates the maps at the other levels from that single map. At the core of RSA is the recursive rolling out mechanism: given an initial map at a parti…

Citations: cited by 97 publications (56 citation statements)
References: 36 publications
“…Since state-of-the-art 2D detectors [20,17,13,16,15] can provide reliable 2D bounding boxes for objects, several works use 2D box as a prior to reduce the search region of 3D box [1,18]. [1] uses a CNN to predict the parts coordinates, visibility and template similarity based on the 2D box, and match the best corresponding 3D template.…”
Section: Related Work (mentioning)
confidence: 99%
“…It treats aligned and unaligned images separately, thereby using a context-switching technique for a given input image. Images are aligned using the co-ordinates provided with the dataset along with MTCNN and Recurrent Scale Approximation (RSA) [41]. Features learned by the CNNs are directly used for classification.…”
Section: (iii) Deep Disguise Recognizer Network (DDRNet) [27] (mentioning)
confidence: 99%
“…On AFW, our algorithm achieves an AP of 99.94% using RPN+S²AP. On FDDB, RPN+S²AP recalls 93.59% faces with 50 false positive higher than [19] which also utilizes the scale information and on MALF our method recalls 77.92% faces with zeros false positive. Note that the shape and scale definition of bounding box on each benchmark varies.…”
Section: Comparing With State-of-the-art (mentioning)
confidence: 89%