2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros47612.2022.9981795
Visual-Inertial Multi-Instance Dynamic SLAM with Object-level Relocalisation

Cited by 12 publications (13 citation statements)
References 32 publications
“…Similarly, Dynamic-VINS [19] refines 2D bounding boxes generated from YOLOv3 [20] and removes feature points of dynamic objects on a resource-limited platform. Ren et al. [5] propose a dense RGB-D-inertial SLAM system that can track and relocalise multiple dynamic objects with the aid of instance segmentation from Mask R-CNN [12]. In contrast, DynaVINS [8] can remove undefined dynamic objects that are dominant in the visual input using camera motion priors from a low-cost IMU.…”
Section: B. Proprioception-aided SLAM
confidence: 99%
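The statement above describes a common pattern in dynamic SLAM: feature points falling on segmented dynamic-object regions are discarded before camera tracking. A minimal sketch of that masking step, assuming a per-pixel instance mask (e.g. from Mask R-CNN) and a hypothetical set of instance ids classified as dynamic:

```python
import numpy as np

def filter_dynamic_features(keypoints, instance_mask, dynamic_ids):
    """Drop feature points lying on dynamic object instances.

    keypoints     : (N, 2) array of (u, v) pixel coordinates
    instance_mask : (H, W) array of per-pixel instance ids (0 = background)
    dynamic_ids   : set of instance ids considered dynamic (assumption:
                    classification into static/dynamic happens upstream)
    """
    u = keypoints[:, 0].astype(int)
    v = keypoints[:, 1].astype(int)
    ids = instance_mask[v, u]                  # instance id under each keypoint
    keep = ~np.isin(ids, list(dynamic_ids))    # retain only static-scene features
    return keypoints[keep]

# toy example: instance 1 occupies the top-left 2x2 block and is dynamic
mask = np.zeros((4, 4), dtype=int)
mask[0:2, 0:2] = 1
kps = np.array([[0, 0], [3, 3], [1, 1]])       # (u, v) keypoints
static = filter_dynamic_features(kps, mask, {1})
print(static)
```

The surviving features feed the robust camera-tracking stage; real systems additionally dilate the masks to account for segmentation boundary errors.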
“…The dynamic objects can, therefore, be removed as outliers during robust camera tracking. On the other hand, when the categories of dynamic objects are predefined, the regions containing these objects can be directly detected using deep learning methods [5]. In the scenario of long-term large occlusion, the majority of the camera view is occluded for most of the time frames.…”
Section: Introduction
confidence: 99%
“…SLAM systems use semantics for better pose estimation or re-localization [2], [3] or to work in dynamic scenes [2], [3]. Semantics can also facilitate downstream tasks such as robotic navigation [4] or augmented reality (AR) experiences [5].…”
Section: Introduction
confidence: 99%
“…Real-time semantic mapping methods usually rely on 2D convolutional neural networks with optional 3D post-processing (2D-3D networks) to annotate incoming images with semantics, using back-projection to lift the semantic labels to the 3D map [6], [3], [7], [5], [8], [1], while recent FP-Conv [7] or SVCNN [6] also rely on lightweight 3D post-processing. 2D-3D networks repetitively process images with similar visual content, solving 2D semantic segmentation from scratch for each image, which may be redundant [9], may lack multi-view consistency in 2D labels [10], and may suffer from occlusions or object-scale uncertainty [11].…”
Section: Introduction
confidence: 99%
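The back-projection step mentioned above lifts each labelled pixel into 3D via the depth map and the pinhole camera model. A hedged sketch, assuming metric depth, a standard 3x3 intrinsic matrix K, and illustrative variable names:

```python
import numpy as np

def backproject_labels(depth, labels, K):
    """Lift per-pixel 2D semantic labels into labelled 3D points.

    depth  : (H, W) depth map in metres (0 marks invalid pixels)
    labels : (H, W) semantic class ids from a 2D segmentation network
    K      : (3, 3) pinhole intrinsics [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
    Returns (M, 3) camera-frame points and their (M,) labels.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]   # X = (u - cx) * Z / fx
    y = (v[valid] - K[1, 2]) * z / K[1, 1]   # Y = (v - cy) * Z / fy
    points = np.stack([x, y, z], axis=1)
    return points, labels[valid]

# toy example: a flat wall 2 m away, uniformly labelled as class 1
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
depth = np.full((480, 640), 2.0)
labels = np.ones((480, 640), dtype=int)
pts, lbl = backproject_labels(depth, labels, K)
```

In a full mapping pipeline these points would then be transformed by the estimated camera pose and fused into the global map, where label fusion across views addresses the multi-view inconsistency the quoted passage mentions.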