2009 American Control Conference
DOI: 10.1109/acc.2009.5160611

Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems

Abstract: Reinforcement learning with linear and non-linear function approximation has been studied extensively in the last decade. However, as opposed to other fields of machine learning such as supervised learning, the effect of finite sample has not been thoroughly addressed within the reinforcement learning framework. In this paper we propose to use regularization in reinforcement learning and planning. More specifically, we control the complexity of the value function approximation using L2 regularization. We cons…

Cited by: 43 publications (58 citation statements)
References: 14 publications
“…A rich body of literature concerns the analysis of approximate value iteration, both in the DP (model-based) setting [10,14,22,23,26] and in the RL (model-free) setting [1,12,25]. In many cases, convergence is ensured by using linearly parameterized approximators [14,25,26].…”
Section: Related Work (mentioning)
confidence: 99%
“…Such discretizations sometimes use interpolation schemes similar to fuzzy approximation. A different class of results analyzes the performance of approximate value iteration for stochastic processes, when only a limited number of samples are available [1,12,22].…”
Section: Related Work (mentioning)
confidence: 99%