Regularized Fitted Q-Iteration: Application to Planning

Farahmand, Amir-massoud; Ghavamzadeh, Mohammad; Szepesvári, Csaba; Mannor, Shie

doi:10.1007/978-3-540-89722-4_5

Cited by 18 publications

(18 citation statements)

References 15 publications

(16 reference statements)

“…Although the approaches above are inspired by principled methods of supervised learning, not much is known about their statistical properties. Recently, Farahmand et al. (2009, 2008) have developed another regularization-based approach that comes with statistical guarantees. The difficulty of using (some) nonparametric techniques is that they are computationally expensive.…”

Section: The Choice Of the Function Space (mentioning)

confidence: 99%

Algorithms for Reinforcement Learning

Szepesvári

2010

Synthesis Lectures on Artificial Intelligence and Machine Learn

“…This way we hope to bring the strength of a powerful supervised learning algorithm to the planning problem. See [12] for more information about RFQI and more precise statements about its theoretical guarantees. It is noteworthy to mention that there have been a few attempts to use regularization in reinforcement learning such as [21] and [22].…”

Section: Regularized Fitted Q-iteration (mentioning)

confidence: 99%

“…We refer the reader to [23] and [12] for further details. Reader who is not interested in rigorous definitions or is already familiar with them may just skip to Section IV-B.…”

Section: A Reinforcement Learning Background and Notations (mentioning)

confidence: 99%

“…This globally-valid model can be used for designing a local controller and/or global planner. Afterwards, Regularized Fitted Q-Iteration (RFQI) [12] will be introduced that finds a close to optimal solution for the learning/planning problem (Section IV). This recently proposed nonparametric reinforcement learning (RL) method uses joint values data and a reward signal to find a policy that maximizes an objective functional.…”

Section: Introduction (mentioning)

confidence: 99%

Model-based and model-free reinforcement learning for visual servoing

Farahmand, Shademan, Jägersand, et al.

2009

2009 IEEE International Conference on Robotics and Automation

Abstract: To address the difficulty of designing a controller for complex visual-servoing tasks, two learning-based uncalibrated approaches are introduced. The first method starts by building an estimated model for the visual-motor forward kinematic of the vision-robot system by a locally linear regression method. Afterwards, it uses a reinforcement learning method named Regularized Fitted Q-Iteration to find a controller (i.e. policy) for the system (model-based RL). The second method directly uses samples coming from the robot without building any intermediate model (model-free RL). The simulation results show that both methods perform comparably well despite not having any a priori knowledge about the robot.

“…For instance, Farahmand et al [3] have used regularized least squares for regularized fitted Q-learning. When using neural networks, weight decay can likewise be used.…”

Section: Introduction (mentioning)

confidence: 99%

Approximate model-assisted Neural Fitted Q-Iteration

Lampe, Riedmiller

2014

2014 International Joint Conference on Neural Networks (IJCNN)

Abstract: In this work, we propose an extension to the Neural Fitted Q-Iteration algorithm that utilizes a learned model to generate virtual trajectories which are used for updating the Q-function. Compared to standard NFQ, this combination has the potential to greatly reduce the amount of system interaction required to learn a good policy. At the same time, the approach still maintains the generalization ability of Q-learning. We provide a general formulation for approximate model-assisted fitted Q-learning, and examine the advantages of its neural implementation regarding interaction time and robustness. Its capabilities are illustrated with first results on a benchmark cart-pole regulation task, on which our method turns out to provide more general policies using much less interaction time.
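
The citation statement from Lampe and Riedmiller above mentions regularized least squares inside fitted Q-iteration. As a rough illustration of that idea only, here is a minimal sketch. It is not the algorithm from Farahmand et al. and makes several simplifying assumptions: features are one-hot over (state, action) pairs, so the ridge solution has a closed form, and the function name `ridge_fitted_q_iteration`, its parameters, and the toy data `demo_transitions` are hypothetical.

```python
from collections import defaultdict


def ridge_fitted_q_iteration(transitions, n_actions, gamma=0.9, lam=0.1, n_iters=50):
    """Fitted Q-iteration with an L2-regularized least-squares fit per iteration.

    transitions: list of (state, action, reward, next_state, done) tuples.
    Assumption: features are one-hot over (state, action), so the ridge
    solution (Phi^T Phi + lam * I)^-1 Phi^T y reduces, per pair, to
    sum(targets) / (count + lam).
    """
    q = defaultdict(float)  # pairs never seen in the data keep value 0
    for _ in range(n_iters):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next, done in transitions:
            # Bellman target built from the previous iterate of Q
            best_next = 0.0 if done else max(q[(s_next, b)] for b in range(n_actions))
            sums[(s, a)] += r + gamma * best_next
            counts[(s, a)] += 1
        # Regression step: ridge fit of the targets, in closed form
        new_q = defaultdict(float)
        for key, total in sums.items():
            new_q[key] = total / (counts[key] + lam)
        q = new_q
    return q


# Hypothetical two-state chain: action 1 moves forward, action 0 stays put;
# taking action 1 in state 1 ends the episode with reward 1.
demo_transitions = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 1, False),
    (1, 1, 1.0, 1, True),
]

q_plain = ridge_fitted_q_iteration(demo_transitions, n_actions=2, lam=0.0)
q_ridge = ridge_fitted_q_iteration(demo_transitions, n_actions=2, lam=1.0)
```

With `lam=0.0` the fit is ordinary least squares on the Bellman targets; a positive `lam` shrinks every estimate toward zero, which is the tabular analogue of the penalty term. With neural networks, weight decay would play the corresponding role, as the statement notes.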
