2018
DOI: 10.48550/arxiv.1807.11332
Preprint

Action Detection from a Robot-Car Perspective

Abstract: We present the new Road Event and Activity Detection (READ) dataset, designed and created from an autonomous vehicle perspective to take action detection challenges to autonomous driving. READ will give scholars in computer vision, smart cars and machine learning at large the opportunity to conduct research into exciting new problems such as understanding complex (road) activities, discerning the behaviour of sentient agents, and predicting both the label and the location of future actions and events, with the…

Cited by 3 publications (4 citation statements)
References 24 publications (70 reference statements)
“…Also, it should be able to detect and anticipate road users' activities, such as moving away, moving towards, and crossing the road, as well as anomalous events, in real time in order to adjust its speed and handle the situation. Therefore, spatio-temporal action localization algorithms need to be developed to guarantee the safety of self-driving cars [205]. Yao et al. [206] proposed a traffic anomaly detection method with a when-where-what pipeline to detect, localize, and recognize anomalous events from egocentric videos.…”
Section: Action Detection in Autonomous Driving
confidence: 99%
“…Action Recognition: A variety of datasets have been introduced for action recognition with a single action label [24,48,20,32,21] and multiple action labels [43,59,4] in videos. Recently released datasets such as AVA [16], READ [14], and EPIC-KITCHENS [12] contain actions with corresponding localization around a person or object. Our TITAN dataset is similar to AVA in the sense that it provides spatio-temporal localization for each agent with multiple action labels.…”
Section: Datasets
confidence: 99%
“…We concatenate the embedded features through (11)-(12), which are obtained from the hidden states of the bounding box encoder GRU (6), the hidden states of the ego encoder GRU (7), the encoded interaction (10), and the action embedding (3). We encode all information over 10 observation time steps from (14). We decode the future locations using the decoder GRU for 20 future time steps (20).…”
Section: Future Object Localization
confidence: 99%
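The statement above describes an encoder-decoder GRU that fuses past bounding-box and ego-motion features over 10 observed steps to predict object locations 20 steps ahead. Below is a minimal PyTorch sketch of that idea; the class name, feature dimensions, and fusion layer are assumptions, and the cited paper's interaction encoding and action embedding are omitted for brevity.

```python
# Hypothetical sketch only; names and dimensions are assumptions,
# not the cited paper's exact architecture.
import torch
import torch.nn as nn

class FutureLocalization(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Encoders for past bounding boxes (x, y, w, h) and ego-motion (speed, yaw).
        self.box_enc = nn.GRU(input_size=4, hidden_size=hidden, batch_first=True)
        self.ego_enc = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        # Concatenate the two encoder hidden states and project back to one state.
        self.fuse = nn.Linear(2 * hidden, hidden)
        self.dec = nn.GRUCell(input_size=4, hidden_size=hidden)
        self.out = nn.Linear(hidden, 4)  # one predicted box per future step

    def forward(self, boxes, ego, horizon=20):
        # boxes: (B, 10, 4) past boxes; ego: (B, 10, 2) past ego-motion.
        _, h_box = self.box_enc(boxes)
        _, h_ego = self.ego_enc(ego)
        h = self.fuse(torch.cat([h_box[-1], h_ego[-1]], dim=-1))
        step = boxes[:, -1]  # seed the decoder with the last observed box
        preds = []
        for _ in range(horizon):
            h = self.dec(step, h)
            step = self.out(h)
            preds.append(step)
        return torch.stack(preds, dim=1)  # (B, 20, 4)

model = FutureLocalization()
future = model(torch.randn(2, 10, 4), torch.randn(2, 10, 2))
print(future.shape)  # torch.Size([2, 20, 4])
```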
“…Second, by splitting the process into two parts, i.e., the tagging and the scenario mining, the scenario mining can be applied to different types of data (e.g., data from a vehicle [16] or from a measurement unit above the road [12], [17]). It is also possible to use manually tagged data, e.g., see [18]. Third, our approach is easily scalable because additional types of scenarios can be mined by describing them as a combination of (sequential) tags.…”
Section: Introduction
confidence: 99%
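The last point above, mining scenarios as combinations of sequential tags, can be illustrated with a small sketch; the tag names and scenario patterns below are invented for illustration and are not taken from the cited work.

```python
# Illustrative sketch: tag names and scenario definitions are assumptions.
tags = ["following", "cut-in", "following", "braking", "lane-change"]

# A scenario is described as a sequence of tags that must occur in order.
scenarios = {
    "overtake-response": ["cut-in", "following", "braking"],
    "evasive-manoeuvre": ["braking", "lane-change"],
}

def mine(tag_stream, pattern):
    """Return start indices where the pattern occurs as a contiguous run."""
    n = len(pattern)
    return [i for i in range(len(tag_stream) - n + 1)
            if tag_stream[i:i + n] == pattern]

for name, pattern in scenarios.items():
    print(name, "found at", mine(tags, pattern))
# overtake-response found at [1]
# evasive-manoeuvre found at [3]
```

Because new scenarios are just new tag patterns, adding one to the dictionary requires no change to the tagging step, which is the scalability property the statement claims.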