Robotics: Science and Systems VI 2010
DOI: 10.15607/rss.2010.vi.036
Closing the Learning-Planning Loop with Predictive State Representations

Abstract: A central problem in artificial intelligence is to plan to maximize future reward under uncertainty in a partially observable environment. Models of such environments include Partially Observable Markov Decision Processes (POMDPs) [4] as well as their generalizations, Predictive State Representations (PSRs) [9] and Observable Operator Models (OOMs) [7]. POMDPs model the state of the world as a latent variable; in contrast, PSRs and OOMs represent state by tracking occurrence probabilities of a set of future ev…



Cited by 73 publications (154 citation statements)
References 6 publications (9 reference statements)
“…Thus, learning a predictive state distribution incorporates the generation of a set of tests and the evaluation of these test samples. At a high level, learning how to generate and evaluate test samples can be seen as a model learning problem, where two models (i.e., for test generation and evaluation) need to be approximated (Boots et al 2010). Formulated in a probabilistic framework, these probabilistic models can be estimated empirically from samples (Boots et al 2010).…”
Section: Mixed Models
confidence: 99%
“…In the first case, learning algorithms have to deal with massive amounts of data, such as in learning inverse dynamics. In this scenario, the algorithms need to be efficient in terms of computation without sacrificing learning accuracy (Bottou et al 2007). In the second scenario, only little data is available for learning, as data generation may be too tedious and expensive.…”
Section: Algorithmic Constraints
confidence: 99%
“…Popular latent variable models of stochastic processes are often learned using heuristics such as Expectation Maximization (EM), which suffer from bad local optima and slow convergence rates. Recent PSR learning algorithms rely on spectral methods [10], [27] and kernel methods [11], which are statistically consistent.…”
Section: Related Work
confidence: 99%
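The statement above contrasts EM with spectral methods for learning PSR/OOM-style models. A minimal sketch of the spectral idea follows: recover observable operators from low-order observation statistics via an SVD, with no latent-state inference and no local optima. The toy 2-state HMM, its parameters, and the use of exact moments (in place of the empirical estimates a real learner would compute from data) are illustrative assumptions, not the algorithm of the cited papers [10], [27].

```python
import numpy as np

# Toy 2-state, 2-observation HMM supplying the "true" statistics that a
# spectral learner would normally estimate from trajectories.
pi = np.array([0.7, 0.3])            # initial state distribution
T = np.array([[0.8, 0.3],            # T[j, i] = Pr(next state j | state i)
              [0.2, 0.7]])
O = np.array([[0.9, 0.2],            # O[x, i] = Pr(observation x | state i)
              [0.1, 0.8]])

# Exact low-order moments (empirical averages over data, in practice).
P1 = O @ pi                                       # P1[a]     = Pr(x1 = a)
P21 = O @ T @ np.diag(pi) @ O.T                   # P21[b, a] = Pr(x1 = a, x2 = b)
P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
        for x in range(2)]                        # P3x1[x][c, a] = Pr(a, x, c)

# Spectral step: project onto the top singular subspace of P21 and read off
# one observable operator B[x] per observation symbol.
U, _, _ = np.linalg.svd(P21)
U = U[:, :2]                                      # rank = number of states
b1 = U.T @ P1                                     # initial predictive state
binf = np.linalg.pinv(P21.T @ U) @ P1             # normalization vector
B = [(U.T @ P3x1[x]) @ np.linalg.pinv(U.T @ P21) for x in range(2)]

def prob_spectral(seq):
    """Pr(seq) from the learned observable operators (no latent state)."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def prob_forward(seq):
    """Reference: exact forward algorithm on the true HMM."""
    alpha = pi.copy()
    for x in seq:
        alpha = T @ (O[x] * alpha)
    return float(alpha.sum())
```

With exact moments and a full-rank P21, the spectral operators reproduce the true string probabilities, e.g. `prob_spectral([0, 1, 0])` agrees with `prob_forward([0, 1, 0])` up to floating-point error; with finite data the same computation is statistically consistent, which is the property the quoted statement highlights.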
“…This problem is far more difficult than the simulated problems explored in previous PSR work [10], [12]-[14]. Additionally, the manipulator used in our experiments has many additional degrees of freedom compared to the systems considered in recent work on bootstrapping in robotics [15]-[17].…”
Section: Introduction
confidence: 96%