2018
DOI: 10.1007/978-3-030-01252-6_24

Receptive Field Block Net for Accurate and Fast Object Detection

Abstract: Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their powerful feature representations but suffering from high computational costs. Conversely, some lightweight model based detectors fulfil real time processing, while their accuracies are often criticized. In this paper, we explore an alternative to build a fast and accurate detector by strengthening lightweight features using a hand-crafted mechanism. Inspired by the structure of Receptiv…

Citations: cited by 1,085 publications (645 citation statements)
References: 40 publications (107 reference statements)

“…Some representative CNN model architectures include AlexNet (Krizhevsky et al, 2012), ZFNet (Zeiler and Fergus, 2014), VGGNet (Simonyan and Zisserman, 2015), GoogLeNet, Inception series (Ioffe and Szegedy, 2015; Szegedy et al, 2017; Szegedy et al, 2016), ResNet, DenseNet (Huang et al, 2017) and SENet (Hu et al, 2018). Also, some researches have been widely explored to further improve the performance of deep learning based methods for object detection, such as feature enhancement (Cai et al, 2016; Cheng et al, 2019; Cheng et al, 2016b; Kong et al, 2016; Liu et al, 2017b), hard negative mining (Lin et al, 2017c), contextual information fusion (Bell et al, 2016; Gidaris and Komodakis, 2015; Zhu et al, 2015b), modeling object deformations (Mordan et al, 2018; Ouyang et al, 2017; Xu et al, 2017), and so on.…”
Section: Regression-based Methods
Citation type: mentioning
confidence: 99%
“…According to the theory of Receptive Fields (RFs) in human visual systems [63], [64], the diverse inputs are beneficial to extract distinctive features. However, from Fig.…”
Section: Distinctive Atrous Spatial Pyramid Pooling (DASPP)
Citation type: mentioning
confidence: 99%
“…Firstly, the motivations of vortex pooling and DASPP are different. The proposed DASPP is motivated by the observation that the diverse inputs play an important role in extracting distinctive features [63]. Hence, DASPP takes advantage of the different sizes of pooling operations to generate the diverse inputs.…”
Section: Distinctive Atrous Spatial Pyramid Pooling (DASPP)
Citation type: mentioning
confidence: 99%
“…The authors of [18] point out that one-stage detectors suffer from the class imbalance problem between foregrounds and backgrounds, and propose focal loss which focuses on hard examples rather than easy ones. Furthermore, recent studies [35,20,39] have improved the performance both in accuracy and inference speed maintaining the efficiency of one-stage detectors.…”
Section: Object Detection
Citation type: mentioning
confidence: 99%