2020
DOI: 10.1109/access.2020.2968937

A Distributed Control Method for Urban Networks Using Multi-Agent Reinforcement Learning Based on Regional Mixed Strategy Nash-Equilibrium

Abstract: Urban network traffic congestion can be caused by disturbances such as fluctuation and disequilibrium of traffic demand. This paper designs a distributed control method for preventing disturbance-based urban network traffic congestion by integrating Multi-Agent Reinforcement Learning (MARL) and regional Mixed Strategy Nash-Equilibrium (MSNE). To enhance the disturbance-rejection performance of Urban Network Traffic Control (UNTC), a regional MSNE concept is integrated, which models the competitive relationship…
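The abstract only outlines the approach, so the snippet below is a minimal illustrative sketch of the general idea rather than the authors' algorithm: two neighbouring signal agents, each choosing between two phases, compute a mixed-strategy Nash equilibrium over their joint actions and use its value as the bootstrap target of a Nash-Q style update. The 2x2 game restriction, the closed-form equilibrium, and all names (mixed_ne_2x2, nash_q_update) are assumptions made for illustration.

```python
import numpy as np

def mixed_ne_2x2(A, B):
    """Interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game.

    A[i, j] / B[i, j]: payoffs of agents 1 / 2 when agent 1 picks phase i
    and agent 2 picks phase j.  Assumes an interior equilibrium exists
    (the denominators below are non-zero).
    """
    # Agent 1 mixes so that agent 2 is indifferent between its two phases.
    p = (B[1, 1] - B[1, 0]) / (B[0, 0] - B[0, 1] - B[1, 0] + B[1, 1])
    # Agent 2 mixes so that agent 1 is indifferent between its two phases.
    q = (A[1, 1] - A[0, 1]) / (A[0, 0] - A[1, 0] - A[0, 1] + A[1, 1])
    return np.array([p, 1.0 - p]), np.array([q, 1.0 - q])

def nash_q_update(Q, s, a1, a2, r, s_next, A_next, B_next,
                  alpha=0.1, gamma=0.9):
    """One Nash-Q style update of agent 1's joint-action value table.

    Q[s] is a 2x2 array: Q[s][a1, a2] is agent 1's value of the joint
    action (a1, a2) in state s.  Instead of a greedy max, the bootstrap
    target uses the value of the mixed-strategy equilibrium of the
    next-state stage game (A_next, B_next).
    """
    pi1, pi2 = mixed_ne_2x2(A_next, B_next)
    nash_value = pi1 @ Q[s_next] @ pi2   # expected value under the MSNE
    Q[s][a1, a2] += alpha * (r + gamma * nash_value - Q[s][a1, a2])
```

In the paper the equilibrium is regional, computed over groups of neighbouring intersections; the two-agent, two-phase restriction above is only so the equilibrium has a simple closed form.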

Cited by 16 publications (8 citation statements)
References 58 publications
“…(El-Tantawy and Abdulhai, 2010; El-Tantawy et al., 2013). Well-known methods proposed in the traffic theory context, such as CTM (Chanloha et al., 2014; Ajorlou et al., 2015; Qu et al., 2020), Max-plus (Kuyer et al., 2008; Medina and Benekohal, 2012), and Max Pressure (MP) or back pressure (brought into NTSC from communications network theory) (Wei et al., 2019a), are also applied in some research, and in other studies multiple traffic optimization goals are optimized simultaneously (multi-policy RL), e.g. (Dusparic et al., 2016).…”
Section: Methods' Contribution and Combination (mentioning)
confidence: 99%
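For readers unfamiliar with the Max Pressure (back-pressure) rule mentioned in this quote, the following is a minimal, self-contained sketch of the basic decision rule; the toy phase and link names and the unweighted pressure definition are simplifying assumptions for illustration, not taken from any of the cited works.

```python
def max_pressure_phase(phases, queue):
    """Pick the signal phase with the largest total 'pressure'.

    phases: {phase_id: [(upstream_link, downstream_link), ...]} movements
            that the phase gives right of way to.
    queue:  {link_id: current queue length in vehicles}.
    The pressure of a movement is queue[upstream] - queue[downstream];
    the controller activates the phase whose movements have the largest
    summed pressure.
    """
    def pressure(movements):
        return sum(queue[u] - queue.get(d, 0.0) for u, d in movements)
    return max(phases, key=lambda p: pressure(phases[p]))

# Hypothetical two-phase intersection with four incoming and four outgoing links.
phases = {
    "NS": [("N_in", "S_out"), ("S_in", "N_out")],
    "EW": [("E_in", "W_out"), ("W_in", "E_out")],
}
queue = {"N_in": 12, "S_in": 9, "E_in": 3, "W_in": 5,
         "N_out": 2, "S_out": 1, "E_out": 0, "W_out": 4}
print(max_pressure_phase(phases, queue))  # -> "NS" for these queue lengths
```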
“…The state of each agent is the vector of states at all links (edges) in the TN. Besides, it is worth noting that the size of the state space should be rationalized to reduce computational complexity [30]. Fortunately, the state of the TN only needs to be detected once to be accessible by all agents.…”
Section: Adviser Agent (mentioning)
confidence: 99%
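To make the "detected once, accessible by all agents" remark concrete, here is a small hypothetical sketch (the SharedNetworkState class and its detector callable are illustration-only assumptions, not an interface from the cited paper): the link-state vector of the traffic network is measured once per control step and every agent reads the same cached copy, so detection cost does not grow with the number of agents.

```python
class SharedNetworkState:
    """Measure the link-state vector once per step; share it with all agents."""

    def __init__(self, detector):
        self._detector = detector      # callable returning {link_id: state}
        self._cache = {}

    def refresh(self):
        # One detection pass per control step for the whole network.
        self._cache = dict(self._detector())

    def observe(self):
        # Every agent gets the same snapshot; no per-agent re-detection.
        return self._cache

# Usage sketch: refresh once, then let each agent build its state from it.
shared = SharedNetworkState(lambda: {"link_1": 4, "link_2": 7})
shared.refresh()
state_vector = shared.observe()   # identical for every agent this step
```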
“…It is necessary to define a special reward function modality for main agents. Formula (27) shows that the travel cost input variable of a sub-action has two components: a sub-action effect estimation defined by integrating equations (29) and (30). Formula (28) models the utility among [agents]; (28) and (29) can be defined in a similar way to (17).…”
Section: (D) Reward Function (mentioning)
confidence: 99%