2018 21st International Conference on Information Fusion (FUSION) 2018
DOI: 10.23919/icif.2018.8455494

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Abstract: The low resolution of objects of interest in aerial images makes pedestrian detection and action detection extremely challenging tasks. Furthermore, using deep convolutional neural networks to process large images can be demanding in terms of computational requirements. In order to alleviate these challenges, we propose a two-step, yes and no question answering framework to find specific individuals doing one or multiple specific actions in aerial images. First, a deep object detector, Single Shot Multibox Det…

Cited by 16 publications (11 citation statements)
References 17 publications

“…We propose a framework in which, first, an SSD [3], which has shown promising performances in the aerial image object detection literature [2] and [17], generates a number of objects of interest proposals for an input aerial image. These proposals might contain vehicle, background, or other objects.…”
Section: Proposed Methods (mentioning)
confidence: 99%
“…In order to alleviate the challenge of objects occupying small number of pixels, we split the problem into two sub-problems [2]. We first assume that a deep detector like Single Shot Multibox Detector (SSD) [3] extracts objects or areas of interest, and second, we use a deep convolutional network to recognize which of the extracted objects of interest are also the vehicles we wish to detect.…”
Section: Introduction (mentioning)
confidence: 99%
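
The two sub-problems described in the statement above (propose regions first, then decide which proposals are the target) can be sketched as follows. This is a minimal illustration only, not the cited authors' implementation: `two_step_detect`, `crop`, `toy_propose`, and `toy_is_target` are hypothetical names, the box convention is an assumption, and the toy functions are stand-ins for a real detector such as SSD and a real CNN classifier.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max); assumed convention
Image = List[List[float]]        # 2D grid of intensities, for illustration only


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def two_step_detect(
    image: Image,
    propose: Callable[[Image], List[Box]],
    is_target: Callable[[Image], bool],
) -> List[Box]:
    """Step 1: collect proposals. Step 2: keep those the classifier accepts."""
    return [box for box in propose(image) if is_target(crop(image, box))]


# Hypothetical stand-in for the proposal stage (NOT a real SSD).
def toy_propose(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


# Hypothetical stand-in for the classifier: accept patches with mean > 0.5.
def toy_is_target(patch: Image) -> bool:
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0])) > 0.5


image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
kept = two_step_detect(image, toy_propose, toy_is_target)
```

Passing the two stages as callables keeps the sketch independent of any particular detector or classifier library.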
“…Their work is limited to single-label HAR, since their detection algorithm, i.e., the Single Shot multi-box Detector (SSD) [18], cannot handle multiple labels. In [12], the authors use a VGG neural network to extract visual features from objects of interest. They subsequently concatenate these features with a bag-of-words representation by using the Visual Question Answering technique [19].…”
Section: Related Work (mentioning)
confidence: 99%
“…The 3D-Conv layer outputs twelve 3D feature maps, $C = \{C^{(1)}, C^{(2)}, \ldots, C^{(12)}\}$, one for each fixed 3D filter. Each feature map in $C$ has $L$ 2D feature maps of spatial dimensions $W \times H$, where $L$ is the number of frames of the input action tube, as defined before.…”
Section: Proposed Dronecaps Architecture (mentioning)
confidence: 99%
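
The shape bookkeeping in the statement above (twelve output volumes, each with L frames of W × H) can be checked with a small pure-Python sketch. Several things here are assumptions not stated in the quoted text: zero padding that preserves the input size, a 3×3×3 kernel, and the placeholder box filters. The names `conv3d_same` and `apply_filter_bank` are hypothetical, and the quoted paper's actual fixed filters are not known from this excerpt.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed [frame][row][col]


def conv3d_same(volume: Volume, kernel: Volume) -> Volume:
    """Zero-padded 3D cross-correlation that keeps the input size.

    Assumes odd kernel sizes so the kernel can be centred on each voxel.
    """
    L, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kl, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    pl, ph, pw = kl // 2, kh // 2, kw // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(L)]
    for t in range(L):
        for y in range(H):
            for x in range(W):
                acc = 0.0
                for dt in range(kl):
                    for dy in range(kh):
                        for dx in range(kw):
                            tt, yy, xx = t + dt - pl, y + dy - ph, x + dx - pw
                            # Positions outside the volume count as zero.
                            if 0 <= tt < L and 0 <= yy < H and 0 <= xx < W:
                                acc += volume[tt][yy][xx] * kernel[dt][dy][dx]
                out[t][y][x] = acc
    return out


def apply_filter_bank(volume: Volume, kernels: List[Volume]) -> List[Volume]:
    """Return one output volume per fixed filter."""
    return [conv3d_same(volume, k) for k in kernels]


# Twelve placeholder 3x3x3 filters (scaled box filters, purely illustrative).
kernels = [
    [[[float(i + 1)] * 3 for _ in range(3)] for _ in range(3)]
    for i in range(12)
]

L, H, W = 4, 5, 6  # frames, rows, columns of a toy action tube
tube = [[[1.0] * W for _ in range(H)] for _ in range(L)]
C = apply_filter_bank(tube, kernels)
```

With these assumptions, `C` holds twelve volumes, and each volume has the same number of frames and the same spatial size as the input tube.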