CAPformer: Pedestrian Crossing Action Prediction Using Transformer

Lorenzo, Javier; Alonso, Ignacio Parra; Izquierdo, Rubén; Ballardini, Augusto Luis; Hernández, Ana Jesús; Llorca, David Fernández; Sotelo, Miguel Ángel

doi:10.3390/s21175694

Cited by 24 publications

(15 citation statements)

References 29 publications


“…Our proposed model suffers from a hard reduction of training samples, compared to [15] which limits its performance on larger datasets, e.g., JAAD_all. Nevertheless, our model is on par with CAPformer [17] and outperforms TrouSPI-Net [19] for both subsets.…”

Section: Methods (mentioning)

confidence: 86%

“…Furthermore, we compare our model, at different anticipation times, against PCPA [15] which uses fixed and overlapped observations of 0.5 s in the range [2 − 1] s earlier the event. For our purposes, this model is modified to output earlier predictions in the range [4 − 1] s. We also consider CAPformer [17] and TrouSPI-Net [19] models.…”

Section: Methods (mentioning)

confidence: 99%

“…More recent works extend this anticipation time to different values with different observation lengths, in addition to developing more advanced models with multiple input features [2,3,4]. In [15], an evaluation benchmark is proposed to tackle this problem, also adopted in [16,17,18,19]. This benchmark focuses on predicting future intentions between 1.0 to 2.0 s earlier the event and uses overlapping windows of 0.5 s as motion history.…”

Section: Introduction (mentioning)

confidence: 99%


Early Pedestrian Intent Prediction via Features Estimation

Osman, Cancelli, Camporese, et al. 2022

2022 IEEE International Conference on Image Processing (ICIP)


Anticipating human motion is an essential requirement for autonomous vehicles and robots, primarily in order to guarantee people's safety. In urban scenarios, they interact with humans, the surrounding environment, and other vehicles, relying on several cues to forecast crossing or not-crossing intentions. For these reasons, this challenging task is often tackled using both visual and non-visual features to anticipate future actions from 2 s to 1 s before the event. Our work primarily aims to revise this standard evaluation protocol to forecast crossing events as early as possible. To this end, we build a solution upon an extensively used model for egocentric action anticipation (RU-LSTM), proposing to envision future features, or modalities, that can better infer human intentions using a suitable attention-based fusion mechanism. We validate our model against JAAD and PIE datasets and demonstrate that an intent prediction model can benefit from these additional cues for anticipating pedestrian crossing events.


“…For pedestrians, body and facial keypoints detectors [35] act as the core for prediction systems. Deep learning approaches use body keypoints to anticipate changes in pedestrian motion patterns [36] and also to predict the intention of crossing from the sidewalk or at a crosswalk [37], [38]. Face key points are of paramount importance for the detection of crossing intention, being eye contact a powerful non-verbal channel of communication often used to express intention to drivers.…”

Section: Overview Of Predictive Perception Systems (mentioning)

confidence: 99%

Testing predictive automated driving systems: lessons learned and future recommendations

Gonzalo, Salinas, Alonso, et al. 2022

Preprint

Self Cite


Conventional vehicles are certified through classical approaches, where different physical certification tests are set up on test tracks to assess required safety levels. These approaches are well suited for vehicles with limited complexity and limited interactions with other entities as last-second resources. However, these approaches allow neither evaluating safety with real behaviors for critical and edge cases nor evaluating the ability to anticipate them in the mid or long term. This is particularly relevant for automated and autonomous driving functions that make use of advanced predictive systems to anticipate future actions and motions to be considered in the path planning layer. In this paper, we present and analyze the results of physical tests on proving grounds of several predictive systems in automated driving functions developed within the framework of the BRAVE project. Based on our experience in testing predictive automated driving functions, we identify the main limitations of current physical testing approaches when dealing with predictive systems, analyze the main challenges ahead, and provide a set of practical actions and recommendations to consider in future physical testing procedures for automated and autonomous driving functions.


“…In this field, 2D poses [2], [3], pedestrian bounding boxes [4], optical flow [5], scene context [6], vehicles speeds [7], trajectories [8], ego-motion of vehicles [7] are utilized in previous works. In the meantime, the deep learning models, such as I3D [5], LSTM/RNN-based temporal models [8], [9], as well as the transformers [10] are adopted in recent years. However, because of the high-mobility of pedestrian, the prediction results of previous works do not approve each other [11], especially for the starting time when the pedestrians show a small scale.…”

Section: Introduction (mentioning)

confidence: 99%

Deep Virtual-to-Real Distillation for Pedestrian Crossing Prediction

Bai, Fang, et al. 2022

2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)


Pedestrian crossing is one of the most typical behaviors that conflict with the natural driving behavior of vehicles. Consequently, pedestrian crossing prediction is one of the primary tasks that influence vehicle planning for safe driving. However, current methods that rely on data collected in real driving scenes cannot depict and cover all kinds of scene conditions in real traffic. To this end, we formulate a deep virtual-to-real distillation framework by introducing synthetic data that can be generated conveniently, and borrow the abundant information on pedestrian movement in synthetic videos for pedestrian crossing prediction in real data with a simple and lightweight implementation. To verify this framework, we construct a benchmark with 4667 virtual videos comprising about 745k frames (called Virtual-PedCross-4667), and evaluate the proposed method on two challenging datasets collected in real driving situations, i.e., the JAAD and PIE datasets. State-of-the-art performance of this framework is demonstrated by exhaustive experimental analysis. The dataset and code can be downloaded from the website.
