2008 IEEE International Conference on Networking, Sensing and Control
DOI: 10.1109/icnsc.2008.4525304

Q-Learning for Adaptive Traffic Signal Control Based on Delay Minimization Strategy

Abstract: The goal of this paper is to test the performance of Q-learning for adaptive traffic signal control. For the Q-learning algorithm, the state is the total delay of the intersection, and the action is the change in phase green time. The relationship between the phase green time change and the action space is discussed. The performance of Q-learning and a fixed-cycle signal setting for an isolated intersection is compared. The computational results show that Q-learning for traffic signal control can achieve lower delay for variable traff…
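The control setup described in the abstract (total intersection delay as the state, a change to the phase green time as the action) maps onto a compact tabular Q-learning loop. The sketch below is an illustrative reconstruction under assumed values: the delay bin width, the action set {-5, 0, +5} seconds, and the learning parameters are assumptions for illustration, not values taken from the paper.

```python
import random
from collections import defaultdict

# Assumed, illustrative values; not taken from the paper.
ACTIONS = [-5, 0, +5]            # change to phase green time per decision step (s)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)           # tabular Q-values keyed by (state, action)

def discretize(total_delay, bin_width=30.0):
    """Map the continuous total intersection delay (veh*s) to a discrete state."""
    return int(total_delay // bin_width)

def choose_action(state):
    """Epsilon-greedy selection over the green-time changes."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard one-step Q-learning update."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

In a simulation loop the agent would call choose_action on the discretized delay, apply the green-time change, observe the resulting total delay, and pass a reward consistent with the delay-minimization objective (e.g., the negative of the observed delay) to update.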

Cited by 31 publications (22 citation statements)
References 5 publications
“…Moreover, most of these studies have considered a simplified simulation environment (Arel et al., 2010; Camponogara & Kraus, 2003; De Oliveira et al., 2006; Richter et al., 2007), and/or assumed hypothetical traffic flows (Arel et al., 2010; Camponogara & Kraus, 2003; De Oliveira et al., 2006; Richter et al., 2007; Shoufeng et al., 2008; Thorpe, 1997), which do not necessarily mimic reality. This article investigates the effect of the following design parameters to bridge this gap in the literature: (1) learning method (Q-Learning vs. SARSA vs. TD(λ)), (2) traffic state representation (queue length vs. queues and arrivals vs. delay), (3) action selection method (ε-greedy vs. softmax vs. ε-softmax), (4) traffic signal phasing scheme (variable vs. fixed), (5) reward definition (delay vs. cumulative delay vs. balancing queues), and (6) variability of flow arrivals to the intersection (uniform vs. variable arrival rates).…”
Section: RL-based Adaptive Traffic Signal Control: The State of the Art
confidence: 98%
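For reference, the three action-selection rules named in item (3) of the quoted passage can be sketched as follows. The function names and the epsilon/temperature values are illustrative assumptions, not details of the cited article.

```python
import math
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon explore uniformly; otherwise pick the greedy action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def softmax(q_values, temperature=1.0):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    weights = [math.exp(q / temperature) for q in q_values]
    return random.choices(range(len(q_values)), weights=weights)[0]

def epsilon_softmax(q_values, epsilon=0.1, temperature=1.0):
    """Act greedily with probability 1 - epsilon; otherwise sample via softmax."""
    if random.random() < epsilon:
        return softmax(q_values, temperature)
    return max(range(len(q_values)), key=lambda a: q_values[a])
```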
“…This section focuses on the studies that considered RL for adaptive traffic signal control (Abdulhai, Pringle, & Karakoulas, 2003; Arel, Liu, Urbanik, & Kohls, 2010; Balaji, German, & Srinivasan, 2010; Camponogara & Kraus, 2003; De Oliveira et al., 2006; Richter, Aberdeen, & Yu, 2007; Salkham, Cunningham, Garg, & Cahill, 2008; Shoufeng, Ximin, & Shiqiang, 2008; Thorpe, 1997; Wiering, 2000). Table 1 summarizes and contrasts these studies.…”
Section: RL-based Adaptive Traffic Signal Control: The State of the Art
confidence: 98%
“…Various algorithms have been developed for traffic management in the literature [7][8][9][10][11][12][13][14][15][16][17].…”
Section: Problem Definition and Motivation
confidence: 99%
“…When the state space of the Markov Decision Process is very large or continuous, the computation and memory load become very large and the problem can no longer be solved this way. On the other hand, in the traditional Q-learning algorithm, the Q-value is updated in the form of a table record; the efficiency of this kind of learning is relatively low, which directly influences the performance of the controller [2].…”
Section: Introduction
confidence: 99%
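The "table record" the quoted passage refers to is the standard one-step tabular Q-learning update, with one stored value per (state, action) pair:

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

Because every visited (state, action) pair needs its own entry and its own sequence of updates, both memory use and learning time grow with the size of the state space, which is the scalability concern raised above.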
“…Sutton proposed a learning algorithm for non-deterministic Markov decision processes [1]. Lu Shoufeng applied tabular Q-learning to dynamically control the traffic signals at an isolated intersection [2]. Wei Wu also developed a coordinated urban traffic signal control approach based on multi-agent reinforcement learning [3].…”
Section: Introduction
confidence: 99%