2008
DOI: 10.1007/978-3-540-89722-4_5

Regularized Fitted Q-Iteration: Application to Planning

Abstract: We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on t…
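To make the idea in the abstract concrete, here is a minimal sketch of fitted Q-iteration with a ridge-penalized kernel regression step. It is an illustration under stated assumptions, not the paper's algorithm: the Gaussian kernel, the one-regressor-per-action layout, the function names (`rbf_kernel`, `fit_kernel_ridge`, `regularized_fqi`), and all default parameter values are choices made for this sketch. It also assumes a finite action set and that every action appears at least once in the sample.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def fit_kernel_ridge(X, y, lam, bandwidth=1.0):
    """Penalized least squares in the RKHS: solve (K + n*lam*I) alpha = y."""
    n = X.shape[0]
    K = rbf_kernel(X, X, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def regularized_fqi(S, A, R, S_next, n_actions,
                    gamma=0.9, lam=1e-3, n_iters=20, bandwidth=1.0):
    """Fitted Q-iteration on transitions (S, A, R, S_next).

    S, S_next: (n, d) float arrays; A: (n,) int array; R: (n,) float array.
    Returns a function mapping (m, d) states to an (m, n_actions) Q-table.
    Assumption of this sketch: one kernel regressor per discrete action.
    """
    supports = [S[A == a] for a in range(n_actions)]
    # Zero coefficients mean the initial Q-function is identically zero.
    alphas = [np.zeros(len(sup)) for sup in supports]

    def q_values(states):
        cols = [rbf_kernel(states, supports[a], bandwidth) @ alphas[a]
                for a in range(n_actions)]
        return np.stack(cols, axis=1)

    for _ in range(n_iters):
        # Regression targets from the current Q estimate (Bellman backup).
        targets = R + gamma * q_values(S_next).max(axis=1)
        # Regularized regression step, refit separately for each action.
        alphas = [fit_kernel_ridge(supports[a], targets[A == a], lam, bandwidth)
                  for a in range(n_actions)]
    return q_values
```

The penalty weight `lam` is the knob that controls model complexity in this sketch: larger values shrink the fitted Q-function toward zero, smaller values let it follow the regression targets more closely.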

Citations: cited by 18 publications (18 citation statements)
References: 15 publications (16 reference statements)
“…Although the approaches above are inspired by principled methods of supervised learning, not much is known about their statistical properties. Recently, Farahmand et al. (2009, 2008) have developed another regularization-based approach that comes with statistical guarantees. The difficulty of using (some) nonparametric techniques is that they are computationally expensive.…”
Section: The Choice Of the Function Space (mentioning)
confidence: 99%
“…This way we hope to bring the strength of a powerful supervised learning algorithm to the planning problem. See [12] for more information about RFQI and more precise statements about its theoretical guarantees. It is noteworthy to mention that there have been a few attempts to use regularization in reinforcement learning such as [21] and [22].…”
Section: Regularized Fitted Q-iteration (mentioning)
confidence: 99%
“…We refer the reader to [23] and [12] for further details. Reader who is not interested in rigorous definitions or is already familiar with them may just skip to Section IV-B.…”
Section: A Reinforcement Learning Background and Notations (mentioning)
confidence: 99%
“…For instance, Farahmand et al [3] have used regularized least squares for regularized fitted Q-learning. When using neural networks, weight decay can likewise be used.…”
Section: Introduction (mentioning)
confidence: 99%