2022
DOI: 10.48550/arxiv.2206.10118
Preprint

HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction

Abstract: In this report, we introduce our solution to the Occupancy and Flow Prediction challenge in the Waymo Open Dataset Challenges at CVPR 2022. We have developed a novel hierarchical spatial-temporal network featuring spatial-temporal encoders, a multi-scale aggregator enriched with latent variables, and a recursive hierarchical 3D decoder. We use multiple losses, including focal loss and a modified flow trace loss, to efficiently guide the training process. Our method achieves a Flow-Grounded Occupancy AUC of 0.8…

citations
Cited by 2 publications
(3 citation statements)
references
References 22 publications
“…Refs. [29,106,107] jointly predict the occupancy grid map and occupancy flow field of the region of interest based on the model's understanding of the scene. Mahjourian et al [29] use a similar input characterization and encoding process as [42] but apply a feature pyramid network (FPN) [100] to fuse multi-scale features during decoding.…”
Section: Occupancy Flow Prediction (mentioning)
confidence: 99%
“…Liu et al [106] consider the extraction of historical motion agent features and the global scene feature and then use a Swin-transformer [108] to consider the fusion of object-level features and global scene-level features. Hu et al [107] define a hierarchical spatial-temporal network with multi-scale feature fusion to encode scene feature maps in both spatial and temporal dimensions.…”
Section: Occupancy Flow Prediction (mentioning)
confidence: 99%
“…While achieving agent-wise accuracy, MATP introduces exponential computations and trajectory-wise inconsistency. Dense predictions directly estimate the future distribution of agents jointly from ego-centered occupancy [9], [20], [21]. A notable issue is the loss of agent-wise tractability.…”
Section: A Predictions and Planning In Ads (mentioning)
confidence: 99%