Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304)
DOI: 10.1109/cdc.1999.830250
Q-learning algorithm using an adaptive-sized Q-table

Cited by 8 publications (4 citation statements) · References 8 publications
“…Therefore, in the proposed method, binary trees for storing Q-values are constructed dynamically during the course of learning, so that only the Q-values corresponding to states that are actually referred to during the learning process are stored (Hirashima et al, 1999). This feature can effectively reduce the memory required to solve a marshalling problem and improve the solution.…”

Section: Data Storage Structure for Storing Q-values
confidence: 99%
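The idea in the statement above — allocating Q-value storage only for states actually visited during learning — can be sketched roughly as follows. This is an illustrative sketch, not the authors' exact data structure: the class names, the choice of a plain binary search tree keyed by an integer state code, and the update parameters are all assumptions.

```python
# Hypothetical sketch of an adaptive-sized Q-table: a binary search tree
# that allocates a node (and its Q-values) only when a state is first
# visited. Names and the plain-BST choice are illustrative assumptions,
# not the data structure from the cited paper.

class QNode:
    def __init__(self, state, n_actions):
        self.state = state              # integer state code
        self.q = [0.0] * n_actions      # Q-values for this state only
        self.left = None
        self.right = None

class AdaptiveQTable:
    def __init__(self, n_actions):
        self.root = None
        self.n_actions = n_actions
        self.size = 0                   # number of states stored so far

    def _lookup(self, state, create):
        # Descend the tree; optionally create a node on first visit.
        if self.root is None:
            if not create:
                return None
            self.root = QNode(state, self.n_actions)
            self.size += 1
            return self.root
        node = self.root
        while True:
            if state == node.state:
                return node
            side = 'left' if state < node.state else 'right'
            nxt = getattr(node, side)
            if nxt is None:
                if not create:
                    return None
                nxt = QNode(state, self.n_actions)
                setattr(node, side, nxt)
                self.size += 1
            node = nxt

    def q_values(self, state):
        # Allocates storage for the state on first reference.
        return self._lookup(state, create=True).q

    def update(self, s, a, reward, s_next, alpha=0.1, gamma=0.9):
        # Standard Q-learning update; memory grows only with visited states.
        q = self.q_values(s)
        q_next = max(self.q_values(s_next))
        q[a] += alpha * (reward + gamma * q_next - q[a])
```

Compared with a dense table over the full state space, `size` grows only with the states the learner actually encounters, which is the memory saving the quoted passage describes.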
“…existing research focuses on reducing latency by building models and designing optimal algorithms [3]. However, the factors that cause delay are diverse and dynamic, such as network latency, disk latency, and other types of latency (RAM, CPU, etc.)…”

Section: Introduction
confidence: 99%
“…Therefore, the conventional reinforcement learning method, Q-learning, has great difficulty solving the marshaling problem, owing to the huge number of learning iterations and states required to obtain an admissible sequence of container operations (Baum, 1999). Recently, a Q-learning method that can generate a marshaling plan has been proposed (Hirashima et al, 1999). Although these methods were effective in several cases, the desired layout was not achieved in every trial, so the early-phase performance of the learning process can be degraded.…”

Section: Introduction
confidence: 99%