2017 IEEE International Conference on Image Processing (ICIP) 2017
DOI: 10.1109/icip.2017.8296412

Multi-glimpse LSTM with color-depth feature fusion for human detection

Abstract: With the development of depth cameras such as Kinect and Intel Realsense, RGB-D based human detection receives continuous research attention due to its usage in a variety of applications. In this paper, we propose a new Multi-Glimpse LSTM (MG-LSTM) network, in which multi-scale contextual information is sequentially integrated to promote the human detection performance. Furthermore, we propose a feature fusion strategy based on our MG-LSTM network to better incorporate the RGB and depth information. To the bes…

Cited by 12 publications (10 citation statements)
References 20 publications (26 reference statements)
“…However, the approach introduced in this work does not hinge on scene-specific a priori knowledge and provides an approximation to the full posterior distribution. In contrast to recent data-driven CNN architectures [22], [28]-[30], [33] our method requires no training data and the detection confidence can be quantified more precisely by approximating the posterior distribution. To the best of our knowledge, variational mean-field inference in combination with a generative scene model has not yet been applied to the problem of people detection in overlapping depth images.…”
Section: Discussion (mentioning)
confidence: 99%
“…In contrast to our proposed method, those approaches focus on integrated systems counting the number of persons crossing a certain virtual line, providing people detection only implicitly and in a rather small area. Recent CNN architectures [28]-[30] are successfully applied to single view depth image people detection leveraging many labeled images for training. Since in our top-view setup position changes of people lead to drastically varying appearances (compared to the classical frontal or profile view), those approaches need to be re-trained with a domain-specific large-scale data set.…”
Section: B. Depth-based Approaches (mentioning)
confidence: 99%
“…Only a few detectors have considered the use of both RGB and depth (RGB-D) images as inputs to their networks [4,5], which are more robust against illumination and texture variations. In [4], a ResNet detector was used to detect upper body parts in an operating room.…”
Section: Person Detection Using Deep Learning Approaches With RGB and… (mentioning)
confidence: 99%
“…In [5], a long short-term memory (LSTM) network was used to detect head-tops. The first layer employed the head-top detection technique presented in [18], where for each possible head-top pixel, a set of bounding boxes were generated from both RGB and depth images.…”
Section: Person Detection Using Deep Learning Approaches With RGB and… (mentioning)
confidence: 99%