Q-learning algorithm using an adaptive-sized Q-table

Hirashima, Yoichi; Iiguni, Youji; Masuda, Shimpei

doi:10.1109/cdc.1999.830250

Cited by 8 publications

(4 citation statements)

References 8 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Therefore, in the proposed method, binary trees for storing Q-values are constructed dynamically during the course of the learning, so that only Q-values corresponding states that are referred in learning process are stored (Hirashima et al, 1999). This feature can effectively reduce the required memory size for solving a marshalling problem and improve the solution.…”

Section: Data Storage Structure For Storing Q-valuesmentioning

confidence: 99%

An Intelligent Marshalling Plan Using a New Reinforcement Learning System for Container Yard Terminals

Hirashima¹

2008

New Developments in Robotics Automation and Control

View full text Add to dashboard Cite

Section: Data Storage Structure For Storing Q-valuesmentioning

confidence: 99%

An Intelligent Marshalling Plan Using a New Reinforcement Learning System for Container Yard Terminals

Hirashima¹

2008

New Developments in Robotics Automation and Control

View full text Add to dashboard Cite

“…existing researches is reducing latency by setting up models and designing optimal algorithms [3]. However, factors that cause the delay are diverse and dynamic, such as network latency, disk latency and other types of latency(RAM, CPU, etc.)…”

Section: Introductionmentioning

confidence: 99%

A Fast Q-Learning Based Data Storage Optimization for Low Latency in Data Center Networks

et al. 2020

View full text Add to dashboard Cite

Data storage optimizations (DS, e.g. low latency for data access) in data center networks(DCN) are difficult online-making problems. Previously, they are done with heuristics under static network models which highly rely on designers' understanding of the environment. Encouraged by recent successes in deep reinforcement learning techniques to solve intricate online assignment problems, we propose to use the Q-learning (QL) technique to train and learn from historical DS decisions, which can significantly reduce the data access delay. However, QL faces two challenges to be widely used in data centers. They are massive input data and the blindness on parameter settings which severely hamper the convergence of the learning process. To solve these two key problems, we develop an evolutionary QL scheme, named as LFDS (Low latency and Fast convergence Data Storage). In the initial stage of the LFDS, the input matrix of QL is sparse to shrink the dimensionality of the massive input data while retaining its information as much as possible. In the following training phase, a specialized neural network is adopted to achieves a quick approximation. To overcome the blindness during QL training, the two key parameters, learning rate, and discount rate are carefully tested with real data input and network architecture. The preferred range of learning rate and discount rate are recommended for the use of QL in data centers, which brings high training rewards and fast convergence. Extensive simulations with real-world data show that the data access latency is decreased by 23.5% and the convergence rate is increased by 15%. INDEX TERMS Data center networks, data access, latency, reinforcement learning, q-learning.

show abstract

“…Therefore, the conventional reinforcement learning method, Q-learning, has great difficulties for solving the marshaling problem, due to its huge number of learning iterations and states required to obtain admissible operation of containers (Baum, 1999). Recently, a Q-learning method that can generate marshaling plan has been proposed (Hirashima et al, 1999). Although these methods were effective several cases, the desired layout was not achievable for every trial so that the early-phase performances of learning process can be degraded.…”

Section: Introductionmentioning

confidence: 99%

An Intelligent Marshaling Plan Based on Multi-Positional Desired Layout in Container Yard Terminals

Hirashima¹

2007

Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics

View full text Add to dashboard Cite

This paper proposes a new scheduling method for a marshaling in the container yard terminal. The proposed method is derived based on Q-Learning algorithm considering the desired position of containers that are to be loaded into a ship. In the method, 3 processes can be optimized simultaneously: rearrangement order of containers, layout of containers assuring explicit transfer of container to the desired position, and removal plan for preparing the rearrange operation. Moreover, the proposed method generates several desired positions for each container, so that the learning performance of the method can be improved as compared to the conventional methods. In general, at container yard terminals, containers are stacked in the arrival order. Containers have to be loaded into the ship in a certain order, since each container has its own shipping destination and it cannot be rearranged after loading. Therefore, containers have to be rearranged from the initial arrangement into the desired arrangement before shipping. In the problem, the number of container-arrangements increases by the exponential rate with increase of total count of containers, and the rearrangement process occupies large part of total run time of material handling operation at the terminal. For this problem, conventional methods require enormous time and cost to derive an admissible result. In order to show effectiveness of the proposed method, computer simulations for several examples are conducted.

show abstract

Q-learning algorithm using an adaptive-sized Q-table

Cited by 8 publications

References 8 publications

An Intelligent Marshalling Plan Using a New Reinforcement Learning System for Container Yard Terminals

An Intelligent Marshalling Plan Using a New Reinforcement Learning System for Container Yard Terminals

A Fast Q-Learning Based Data Storage Optimization for Low Latency in Data Center Networks

An Intelligent Marshaling Plan Based on Multi-Positional Desired Layout in Container Yard Terminals

Contact Info

Product

Resources

About