2013
DOI: 10.1109/tsmca.2012.2227719

A Deterministic Improved Q-Learning for Path Planning of a Mobile Robot

Abstract: The paper provides a new deterministic Q-learning with a presumed knowledge of the distance from the current state to both the next state and the goal. This knowledge is efficiently used to update the entries in the Q-table once only, by utilizing four derived properties of Q-learning, instead of updating them repeatedly as in classical Q-learning. Naturally, the proposed algorithm has an insignificantly small time complexity in comparison to its classical counterpart. Further, the proposed al…
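
For context, the classical counterpart that the abstract contrasts against updates each visited Q-table entry repeatedly over many episodes. Below is a minimal sketch of that tabular baseline on an assumed grid-world path-planning task; the grid size, reward scheme, and exploration settings are illustrative assumptions, not values from the paper, and the paper's one-shot update based on distance knowledge and its four derived properties is not reproduced here.

import numpy as np

# Classical tabular Q-learning on an assumed 10x10 grid (illustrative only).
N = 10
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
GOAL = (N - 1, N - 1)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2              # assumed hyperparameters

Q = np.zeros((N, N, len(ACTIONS)))

def step(state, a_idx):
    """Deterministic grid move; reward 100 at the goal, -1 per step (assumed)."""
    dr, dc = ACTIONS[a_idx]
    r, c = state
    nr = min(max(r + dr, 0), N - 1)
    nc = min(max(c + dc, 0), N - 1)
    reward = 100.0 if (nr, nc) == GOAL else -1.0
    return (nr, nc), reward

rng = np.random.default_rng(0)
for episode in range(500):
    state = (0, 0)
    while state != GOAL:
        if rng.random() < EPS:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[state]))
        nxt, reward = step(state, a)
        # Classical update, applied again and again to the same entries:
        # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
        Q[state][a] += ALPHA * (reward + GAMMA * np.max(Q[nxt]) - Q[state][a])
        state = nxt

The repeated visits to the same (state, action) entries in the loop above are exactly what the proposed deterministic variant avoids by exploiting known distances to the next state and the goal.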

Citations: Cited by 202 publications (83 citation statements)
References: 27 publications
“…Konar et al. [24] have applied Q-learning to path planning of a mobile robot. However, if the state spaces become very large or continuous, these algorithms will be computationally expensive and impractical for applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…The other contribution of this paper is the application of value function approximation with a linear architecture in path planning of mobile robots. Tabular reinforcement learning methods, like Q-learning, have been applied in path planning with discrete state spaces [24]. However, little work has been done to use linear function approximation methods to deal with the problem of path planning in large or continuous state spaces.…”
Section: Introduction (mentioning)
confidence: 99%
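
The linear value-function approximation contrasted with tabular methods in the statement above can be sketched as semi-gradient Q-learning on one weight vector per action. The feature map, state representation, and hyperparameters below are illustrative assumptions, not the cited paper's design.

import numpy as np

N_ACTIONS = 4
ALPHA, GAMMA = 0.05, 0.9          # assumed step size and discount factor

# One weight vector per action: Q(s, a) ≈ w[a] · phi(s)
w = np.zeros((N_ACTIONS, 4))

def phi(state, goal):
    """Hand-crafted features of a continuous 2-D state (assumed for illustration)."""
    dx, dy = goal[0] - state[0], goal[1] - state[1]
    return np.array([1.0, dx, dy, np.hypot(dx, dy)])

def q_value(state, goal, a):
    return float(w[a] @ phi(state, goal))

def update(state, a, reward, next_state, goal):
    """Semi-gradient Q-learning step on the linear weights."""
    target = reward + GAMMA * max(q_value(next_state, goal, b) for b in range(N_ACTIONS))
    td_error = target - q_value(state, goal, a)
    w[a] += ALPHA * td_error * phi(state, goal)

# Example single update with assumed states:
update(state=(0.0, 0.0), a=3, reward=-1.0, next_state=(0.0, 1.0), goal=(9.0, 9.0))

Because the weights generalize across states through phi, such a scheme can handle the large or continuous state spaces for which a Q-table becomes impractical.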
“…μ_0 is an admissible control policy and Q^{μ_0} is the Q-function of μ_0. The approximate Q-function Q̂^{μ_i} satisfies the iterative error condition (38). Then, the Q-function sequence {Q̂^{μ_i}} approaches Q* according to the following inequalities:…”
Section: B. Error Bound for Approximate Policy Iteration (mentioning)
confidence: 99%
“…The success of Q-Learning has led to many applications, such as path planning [25,31], energy management [30], routing in vehicular ad-hoc networks [42], management of water resources [27], and production planning [10].…”
Section: Q-Learning (mentioning)
confidence: 99%