2019
DOI: 10.1109/access.2019.2944792

Multi-Task Deep Learning for Pedestrian Detection, Action Recognition and Time to Cross Prediction

Abstract: A pedestrian detection system is a crucial component of advanced driver assistance systems since it contributes to road flow safety. The safety of traffic participants could be significantly improved if these systems could also predict and recognize pedestrian's actions, or even estimate the time, for each pedestrian, to cross the street. In this paper, we focus not only on pedestrian detection and pedestrian action recognition but also on estimating if the pedestrian's action presents a risky situation accord…

citations
Cited by 37 publications
(19 citation statements)
references
References 23 publications
“…For computing the gait information we have employed the pose extraction algorithm proposed by [76], [77], [78]. AlexNet is trained for gait (walking / … [46] and the model is modified in order to provide features on top of which a SVM classifier is trained. We have used as context the road information provided by the semantic segmentation module.…”
Table fragment embedded in the quoted passage (Method: Accuracy): SVM and gait information [46]: 88.75%; LSTM + bounding box information [79]: 80.5%; ACF pedestrian detector [6]: accuracy not captured.
Section: Cross Action Recognition Evaluation
Citation type: mentioning
confidence: 99%
“…Similarly, Saleh et al [52] propose to predict the intended actions of pedestrians based on a spatio-temporal DenseNet model. Pop et al [8] propose to extract spatial information with convolutive layers, then consider temporal dynamics with recurrent layers and propose a new metric for pedestrians dynamics evaluation: the time to cross (TTC) prediction. Some works are based on state-of-the-art generative methods in deep-learning, focusing on the future representation of the action, and then classify it in its globality: Gujjar et al [53] and Chaabane et al [54] process the crossing actions classification by feeding the predicted frames of their future frame prediction auto-encoder network into a classification network.…”
Section: Pedestrian Intention Prediction
Citation type: mentioning
confidence: 99%
“…Human action recognition applied to video is a difficult research topic due to the great variation and complexity of the input data. Currently, the main modalities used for these tasks include RGB videos in their entirety [1][2][3][4], optical flow [5][6][7][8] and skeleton form modeling [9][10][11][12]. The latter requires, prior to the action classification, an approach to estimate the human pose [13].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In addition, multi-task deep learning [10], [23], [25]- [27] has attracted the attention of researchers due to its potential to boost the performance of each individual task and improve the efficiency of the total network. In RBNet [10], a Bayesian model is implemented, and the RBNet can learn to estimate the road and the road boundary simultaneously.…”
Section: Introduction
Citation type: mentioning
confidence: 99%