2013
DOI: 10.1109/tsmca.2012.2227719

A Deterministic Improved Q-Learning for Path Planning of a Mobile Robot

Abstract: The paper provides a new deterministic Q-learning with a presumed knowledge of the distance from the current state to both the next state and the goal. This knowledge is efficiently used to update the entries in the Q-table once only, by utilizing four derived properties of Q-learning, instead of updating them repeatedly as in classical Q-learning. Naturally, the proposed algorithm has an insignificantly small time complexity in comparison to its classical counterpart. Further, the proposed al…
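
For context, the classical counterpart that the abstract contrasts against updates each visited Q-table entry repeatedly over many episodes. Below is a minimal sketch of that tabular baseline on an assumed grid-world path-planning task; the grid size, reward scheme, and exploration settings are illustrative assumptions, not values from the paper, and the paper's one-shot update based on distance knowledge and its four derived properties is not reproduced here.

import numpy as np

# Classical tabular Q-learning on an assumed 10x10 grid (illustrative only).
N = 10
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
GOAL = (N - 1, N - 1)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2              # assumed hyperparameters

Q = np.zeros((N, N, len(ACTIONS)))

def step(state, a_idx):
    """Deterministic grid move; reward 100 at the goal, -1 per step (assumed)."""
    dr, dc = ACTIONS[a_idx]
    r, c = state
    nr = min(max(r + dr, 0), N - 1)
    nc = min(max(c + dc, 0), N - 1)
    reward = 100.0 if (nr, nc) == GOAL else -1.0
    return (nr, nc), reward

rng = np.random.default_rng(0)
for episode in range(500):
    state = (0, 0)
    while state != GOAL:
        if rng.random() < EPS:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[state]))
        nxt, reward = step(state, a)
        # Classical update, applied again and again to the same entries:
        # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
        Q[state][a] += ALPHA * (reward + GAMMA * np.max(Q[nxt]) - Q[state][a])
        state = nxt

The repeated visits to the same (state, action) entries in the loop above are exactly what the proposed deterministic variant avoids by exploiting known distances to the next state and the goal.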

Citations: Cited by 202 publications (83 citation statements)
References: 27 publications
“…Konar et al. [24] have applied Q-learning to path planning of a mobile robot. However, if the state spaces become very large or continuous, these algorithms will be computationally expensive and impractical for applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…The other contribution of this paper is the application of value function approximation with a linear architecture in path planning of mobile robots. Tabular reinforcement learning methods, like Q-learning, have been applied in path planning with discrete state spaces [24]. However, little work has been done to use linear function approximation methods to deal with the problem of path planning in large or continuous state spaces.…”
Section: Introduction (mentioning)
confidence: 99%
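
The linear value-function approximation contrasted with tabular methods in the statement above can be sketched as semi-gradient Q-learning on one weight vector per action. The feature map, state representation, and hyperparameters below are illustrative assumptions, not the cited paper's design.

import numpy as np

N_ACTIONS = 4
ALPHA, GAMMA = 0.05, 0.9          # assumed step size and discount factor

# One weight vector per action: Q(s, a) ≈ w[a] · phi(s)
w = np.zeros((N_ACTIONS, 4))

def phi(state, goal):
    """Hand-crafted features of a continuous 2-D state (assumed for illustration)."""
    dx, dy = goal[0] - state[0], goal[1] - state[1]
    return np.array([1.0, dx, dy, np.hypot(dx, dy)])

def q_value(state, goal, a):
    return float(w[a] @ phi(state, goal))

def update(state, a, reward, next_state, goal):
    """Semi-gradient Q-learning step on the linear weights."""
    target = reward + GAMMA * max(q_value(next_state, goal, b) for b in range(N_ACTIONS))
    td_error = target - q_value(state, goal, a)
    w[a] += ALPHA * td_error * phi(state, goal)

# Example single update with assumed states:
update(state=(0.0, 0.0), a=3, reward=-1.0, next_state=(0.0, 1.0), goal=(9.0, 9.0))

Because the weights generalize across states through phi, such a scheme can handle the large or continuous state spaces for which a Q-table becomes impractical.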
“…μ_0 is an admissible control policy and Q^{μ_0} is the Q-function of μ_0. The approximate Q-function Q̂^{μ_i} satisfies the iterative error condition (38). Then, the Q-function sequence {Q̂^{μ_i}} approaches Q* according to the following inequalities:…”
Section: B. Error Bound for Approximate Policy Iteration (mentioning)
confidence: 99%
“…The success of Q-Learning has led to many applications, such as path planning [25,31], energy management [30], routing in vehicular ad-hoc networks [42], management of water resources [27], and production planning [10].…”
Section: Q-Learning (mentioning)
confidence: 99%