2022
DOI: 10.1109/jsen.2022.3208200
EMA-VIO: Deep Visual–Inertial Odometry With External Memory Attention

Cited by 13 publications (15 citation statements)
References 23 publications
“…Compared with VINS-Mono [23], the advantage of our proposed approach lies not only in the fusion stage but also in the front-end feature extraction described in Section II.A. Besides, the improvement over EMA-VIO [1], which also deploys a Transformer-based approach for fusion, possibly comes from the multi-layer fusion module aggregating the LiDAR and inertial data at different scales [31], [32].…”
Section: Positioning Results on KITTI Dataset
confidence: 99%
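
To make the multi-scale fusion idea in the statement above concrete, the following is a minimal PyTorch sketch that fuses an inertial feature vector with LiDAR feature maps at several pyramid scales. The module names, dimensions, and the simple concatenate-and-convolve fusion are illustrative assumptions, not the cited paper's actual implementation.

# Minimal sketch of multi-layer (multi-scale) LiDAR-inertial feature fusion
# (illustrative only; not the cited paper's implementation). The inertial
# feature is broadcast over each LiDAR feature map in a pyramid and fused
# by a 1x1 convolution at every scale.
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    def __init__(self, lidar_channels=(64, 128, 256), inertial_dim=128):
        super().__init__()
        self.fusers = nn.ModuleList(
            nn.Conv2d(c + inertial_dim, c, kernel_size=1) for c in lidar_channels
        )

    def forward(self, lidar_pyramid, inertial_feat):
        # lidar_pyramid: list of (B, C_k, H_k, W_k) maps at decreasing resolution
        # inertial_feat: (B, inertial_dim) vector from an IMU encoder
        fused = []
        for fmap, fuser in zip(lidar_pyramid, self.fusers):
            b, _, h, w = fmap.shape
            # Tile the inertial vector to match this scale's spatial size.
            imu = inertial_feat[:, :, None, None].expand(b, -1, h, w)
            fused.append(fuser(torch.cat([fmap, imu], dim=1)))
        return fused  # same shapes as the input pyramid, now inertial-aware

# Usage: a 3-level pyramid from a batch of 2 LiDAR range images.
pyramid = [torch.randn(2, 64, 64, 512),
           torch.randn(2, 128, 32, 256),
           torch.randn(2, 256, 16, 128)]
model = MultiScaleFusion()
out = model(pyramid, torch.randn(2, 128))
print([o.shape for o in out])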
“…Inspired by ViLT [28], the Transformer [15] architecture has shown impressive performance in multi-modal fusion, not only in odometry estimation but also in navigation [29], semantic segmentation, and object detection [30]. In EMA-VIO [1] and AFT-VO [2], the Transformer architecture is used to fuse multiple modalities and, in challenging real-world experiments, has shown higher accuracy and robustness than some soft-mask-based approaches. However, these works did not consider the effect of the fusion position, and the Transformer is used as a black box, without interpretability to explain how the two modalities interact and fuse inside the Transformer architecture.…”
Section: Learning-Based Multi-Modal Fusion for Odometry Estimation
confidence: 99%
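
As context for how such a Transformer-based fusion stage typically looks, here is a minimal PyTorch sketch: visual and inertial feature tokens are projected to a shared width, tagged with learned modality embeddings, and mixed by self-attention before pose regression. All names, dimensions, and the mean-pooled pose head are illustrative assumptions, not the actual EMA-VIO or AFT-VO implementation.

# Minimal sketch of Transformer-based visual-inertial fusion (illustrative
# only; not the EMA-VIO/AFT-VO code). Both modalities are projected to a
# shared width, concatenated into one token sequence, and mixed by
# self-attention so each token can attend across modalities.
import torch
import torch.nn as nn

class TransformerFusion(nn.Module):
    def __init__(self, visual_dim=512, inertial_dim=256, d_model=256,
                 nhead=8, num_layers=2):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, d_model)
        self.inertial_proj = nn.Linear(inertial_dim, d_model)
        # Learned embeddings tell attention which modality a token came from.
        self.modality_embed = nn.Parameter(torch.zeros(2, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 6)  # 3-DoF translation + 3-DoF rotation

    def forward(self, visual_tokens, inertial_tokens):
        # visual_tokens: (B, Nv, visual_dim); inertial_tokens: (B, Ni, inertial_dim)
        v = self.visual_proj(visual_tokens) + self.modality_embed[0]
        i = self.inertial_proj(inertial_tokens) + self.modality_embed[1]
        fused = self.encoder(torch.cat([v, i], dim=1))  # (B, Nv+Ni, d_model)
        # Pool the fused sequence and regress a relative pose.
        return self.head(fused.mean(dim=1))

# Usage: fuse 4 visual tokens with 10 IMU tokens for a batch of 2.
model = TransformerFusion()
pose = model(torch.randn(2, 4, 512), torch.randn(2, 10, 256))
print(pose.shape)  # torch.Size([2, 6])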