2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019
DOI: 10.1109/iros40897.2019.8967730
Forecasting Time-to-Collision from Monocular Video: Feasibility, Dataset, and Challenges

Abstract: We explore the possibility of using a single monocular camera to forecast the time to collision between a suitcase-shaped robot being pushed by its user and other nearby pedestrians. We develop a purely image-based deep learning approach that directly estimates the time to collision without relying on explicit geometric depth estimates or velocity information to predict future collisions. While previous work has focused on detecting immediate collision in the context of navigating Unmanned Aerial Ve…

Cited by 29 publications (23 citation statements)
References 35 publications
“…3D object detection from a single image (monocular vision) is an indispensable part of future autonomous driving [51] and robot vision [28] because a single cheap onboard camera is readily available in most modern cars. Successful modern day methods for 3D object detection heavily rely on 3D sensors, such as a depth camera, a stereo camera or a laser scanner (i.e., LiDAR), which can provide explicit 3D information about the entire scene.…”
Section: Introduction
“…Our method struggles more with predictions for farther pedestrians (Time-to-Collision > 3 s); however, it produces slightly smoother and more accurate results for closer pedestrians in the Near-Collision test set (Figure 12) [25]. It is a multi-stream network that takes 6 consecutive RGB frames as input through the same convolutional network.…”
Section: Comparison With the Near-Collision Network
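The citation statement above describes the compared architecture as multi-stream: the same convolutional network processes each of 6 consecutive RGB frames, and the per-frame features are combined to regress a single time-to-collision value. Below is a minimal toy sketch of that weight-sharing idea in NumPy; the function names (`shared_conv_features`, `predict_ttc`), the single 3×3 kernel, and the linear regression head are illustrative stand-ins, not the actual Near-Collision network from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_conv_features(frame, kernel):
    """Toy stand-in for a shared convolutional trunk: one 3x3 valid
    convolution over the grayscale mean of an RGB frame, followed by
    global average pooling down to a single scalar feature."""
    gray = frame.mean(axis=2)          # (H, W, 3) -> (H, W)
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(gray[i:i + 3, j:j + 3] * kernel)
    return out.mean()

def predict_ttc(frames, kernel, head_w, head_b):
    """Apply the SAME kernel to every frame (the multi-stream,
    shared-weights idea), then combine the per-frame features with a
    linear regression head to predict one time-to-collision value."""
    feats = np.array([shared_conv_features(f, kernel) for f in frames])
    return float(feats @ head_w + head_b)

# 6 consecutive RGB frames, as in the description above (tiny 8x8 images).
frames = [rng.random((8, 8, 3)) for _ in range(6)]
kernel = rng.random((3, 3))            # shared across all streams
head_w = rng.random(6)                 # one weight per frame feature
ttc = predict_ttc(frames, kernel, head_w, 0.5)
```

Sharing one trunk across the frame streams keeps the parameter count independent of the number of input frames; only the small regression head grows with the temporal window.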
“…Figure 6. Three different test videos from the Near-Collision dataset with their labeled Time-to-Collision ground truth and their predicted time [25].…”