2019
DOI: 10.1007/s00138-019-01017-9

Gated bidirectional feature pyramid network for accurate one-shot detection

Abstract: Despite recent advances in machine learning, it is still challenging to realize real-time and accurate detection in images. The recently proposed StairNet detector (Sanghyun et al. in Proceedings of winter conference on applications of computer vision (WACV), 2018), one of the strongest one-stage detectors, tackles this issue by using an SSD in conjunction with a top-down enrichment module. However, the StairNet approach misses the finer localization information which can be obtained from the lower layer and la…

Cited by 23 publications (18 citation statements)
References 54 publications
“…In order to verify the effectiveness of the cascaded convolutional neural network model and training method designed in this paper, using the same training and test data, five target detection methods based on convolutional neural networks were compared: SSD300 [31], YoLoV2 [32], FRCNN [33], RetinaNet [34] and MTCNN algorithm [35]. SSD300 uses VGG16 [36] as the backbone network, and YoLov2, FRCNN and RetinaNet use ResNet50 as the backbone network.…”
Section: Figure 8ap Results For Different Image Types (mentioning)
confidence: 99%
“…In [10], Woo et al proposed a gated bidirectional feature pyramid network to tackle this issue by using a gating module on the SSD frame. The gate module is not easy to be trained.…”
Section: Introduction (mentioning)
confidence: 99%
“…It is bidirectional and can fuse both deep and shallow features towards more effective and robust object detection. Due to the "residual" nature, similar to ResNet [5], it can be easily trained and integrated into different backbones (even deeper or lighter) than other bi-directional methods [7], [10]. Besides this structure, a new BiFusion module is proposed to let the "residual" features form a compact representation that brings more accurate localization information into each prediction layer so that not only the results on small-sized object detection but also large/medium-sized ones are improved.…”
Section: Introduction (mentioning)
confidence: 99%
“…Both these approaches rely on assimilating information via their pixel-connectivity to improve feature representations. For scale relations, many efforts have been made on fusing features across scales to alleviate the discrepancy of feature maps from different levels of bottom-up hierarchy and feature scale-space, including top-down information flow [15, 40, 54], an extra bottom-up information path [31,43,68], multiple hourglass structures [46,81], concatenating features from different layers [4,20,38,59] or different tasks [52], gradual multi-stage local information fusions [58,75], pyramid convolutions [67], etc. Even though standard design principles for scale relations are emerging for ConvNet architectures, the problem is far from being solved.…”
Section: Introduction (mentioning)
confidence: 99%