2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.01559

Visual Graph Memory with Unsupervised Representation for Visual Navigation

Cited by 27 publications (19 citation statements)
References 20 publications
“…Our setup differs from recent methods in ImageNav where panoramic 360°FoV sensors are required [15,39,46]. Here, we consider a standard 90°FoV for the agent's view [72].…”
Section: Semantic Search Policy For Image Goals (mentioning)
confidence: 99%
“…a) Navigation approaches: Traditional approaches to visual navigation focus on building a 3D metric map of the environment [18], [3] before using that representation for any downstream navigation tasks, which does not lend itself favourably for task-driven learnable representations that can capture contextual cues. The recent introduction of large-scale indoor environments and simulators [7], [17], [6] has fuelled a slew of learning based methods for indoor navigation tasks [1] such as point-goal [10], [19], [20], [21], [22], object-goal [23], [24], [25], [26], [27], and image-goal [8], [28], [29]. Modular approaches which incorporate explicit or learned map representations [11], [23], [25] have shown to outperform end-to-end methods on tasks such as object-goal, however, this is not currently the case for the point-goal [10], [20] task.…”
Section: Related Work (mentioning)
confidence: 99%
“…In other words, the authors completely replaced the RNN with an attention mechanism, which shows good performance in long-term tasks. Visual graph memory (VGM) [18] constructs a topological visual memory to navigate an environment while not utilizing the landmark information of the scene. No RL no simulator (NRNS) [27] uses models trained using image input without interaction with the simulator.…”
Section: Related Work (mentioning)
confidence: 99%
“…A crucial ingredient for successful visual navigation is to construct a memory, which can represent the structure of the environment along with compact visual features for representing high-dimensional visual inputs. A metric-map memory [5,17] created with SLAM, and a graph memory [8,9,18-20] with nodes and edges are the two standard memory construction approaches for navigation algorithms. Even though navigation systems that use metric maps produce powerful results with exact localization and mapping, it is not practical because the navigation agent is susceptible to sensory noises.…”
Section: Introduction (mentioning)
confidence: 99%