2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2018.00135
Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

Abstract: It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge. We address the problem of learning to look around: if a visual agent has the ability to voluntarily acquire new views to observe its environment, how can it learn efficient exploratory behaviors to acquire informative observations? We propose a reinforcement learning solution, where the agent is rewarded for actions that re…

Cited by 77 publications (88 citation statements). References 61 publications (79 reference statements).
“…As a core solution to these challenges, we present a reinforcement learning (RL) approach for active observation completion (23). See Figure 2.…”
Section: Approach Overview
confidence: 99%
“…where x̂_t(X, θ^(i)) denotes the reconstructed view at viewpoint θ^(i) and time t, d denotes the pixelwise reconstruction MSE, and Δ_0 denotes the offset to account for the unknown starting azimuth (23).…”
Section: Policy Learning Formulation
confidence: 99%
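The quoted formulation scores a reconstructed viewgrid by pixelwise MSE, minimized over the unknown starting-azimuth offset Δ_0. A minimal numpy sketch of that idea (function and array names are illustrative, not the paper's code):

```python
import numpy as np

def reconstruction_reward(pred_viewgrid, true_viewgrid):
    """Negative pixelwise MSE between predicted and ground-truth viewgrids,
    minimized over all azimuthal offsets Delta_0 (axis 0 is azimuth)."""
    num_azimuths = true_viewgrid.shape[0]
    errors = []
    for delta in range(num_azimuths):
        # Roll the ground truth along the azimuth axis to try each offset.
        rolled = np.roll(true_viewgrid, shift=delta, axis=0)
        errors.append(np.mean((pred_viewgrid - rolled) ** 2))
    # Reward is highest (zero) when some rotation matches exactly.
    return -min(errors)
```

Because the reward takes the best-matching rotation, a prediction that is correct up to an azimuthal shift is not penalized, matching the unknown-starting-azimuth convention in the quote.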
“…Decoder: To learn a representation with this property, the output of the encoder is processed through another fully connected layer to increase its dimensionality. The complete architecture, together with more detailed specifications, is visualized in Fig. 2. Our convolutional encoder-decoder [43] neural network architecture is similar to [30,58,66,69]. As discussed above, however, the primary focus of our work is very different.…”
Section: Network Architecture and Training
confidence: 99%
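The quoted architecture encodes an observation, raises the latent code's dimensionality through an extra fully connected layer, and then decodes a full viewgrid. A toy numpy sketch of that encoder → FC → decoder shape (all dimensions and weights are invented for illustration; the paper's actual network is convolutional):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical sizes: a flattened observed view is 32-d, the latent code
# is 8-d, and the decoded viewgrid has 4 azimuthal views of 32-d each.
W_enc = rng.standard_normal((32, 8)) * 0.1
W_up = rng.standard_normal((8, 16)) * 0.1    # FC layer raising dimensionality
W_dec = rng.standard_normal((16, 4 * 32)) * 0.1

def encode_decode(view):
    z = relu(view @ W_enc)            # encoder -> latent code
    h = relu(z @ W_up)                # extra fully connected layer
    return (h @ W_dec).reshape(4, 32) # decoder -> full viewgrid
```

The sketch only illustrates the data flow the quote describes: one observed view in, a complete viewgrid of all viewpoints out.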
“…How then can it know the correct viewpoint coordinates for the viewgrid it must produce? It instead produces viewgrids aligned with the observed viewpoint at the azimuthal coordinate origin, similar to [30]. Azimuthal rotations of a given viewgrid all form an equivalence class.…”
Section: Network Architecture and Training
confidence: 99%
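The quote notes that all azimuthal rotations of a viewgrid form one equivalence class, and that the network outputs viewgrids aligned with the observed viewpoint at the azimuthal coordinate origin. A small numpy sketch of both conventions (helper names are hypothetical):

```python
import numpy as np

def align_to_observed(viewgrid, observed_idx):
    """Shift a viewgrid (azimuth-major array) so the observed viewpoint
    lands at the azimuthal coordinate origin, i.e. index 0."""
    return np.roll(viewgrid, -observed_idx, axis=0)

def azimuthal_equivalence_class(viewgrid):
    """All azimuthal rotations of a viewgrid; under the convention in the
    quote these are treated as equivalent outputs."""
    num_azimuths = viewgrid.shape[0]
    return [np.roll(viewgrid, k, axis=0) for k in range(num_azimuths)]
```

Aligning to the observed view removes the ambiguity: any member of the equivalence class maps to the same canonical viewgrid once its observed viewpoint is placed at index 0.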
“…The influential work of [25] learns a policy for visual attention in image classification. Active perception systems use RNNs and/or reinforcement learning to select places to look in a novel image [26,27], environment [28,29,30], or video [31,32,33,34] to detect certain objects or activities efficiently. Broadly construed, we share the general goal of efficiently converging on a desired target "view", but our problem domain is entirely different.…”
Section: Related Work
confidence: 99%