2021
DOI: 10.1609/icaps.v31i1.16007
Learning and Exploiting Shaped Reward Models for Large Scale Multiagent RL

Abstract: Many real-world systems involve interaction among a large number of agents to achieve a common goal, for example, air traffic control. Several model-free RL algorithms have been proposed for such settings. A key limitation is that the empirical reward signal in the model-free case is not very effective in addressing the multiagent credit assignment problem, which determines an agent's contribution to the team's success. This results in lower solution quality and high sample complexity. To address this, we contribute…
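The credit assignment problem the abstract describes can be illustrated with a classic baseline, difference rewards: score each agent by how much the team objective drops when that agent is removed. This is a generic illustration, not the shaped reward models contributed by the paper; the toy objective and all names below are assumptions made for the sketch.

```python
# Illustrative sketch of multiagent credit assignment via difference
# rewards. The toy objective and all names are assumptions, not the
# paper's method.

def global_reward(actions):
    # Toy team objective: the team is rewarded for spreading agents
    # evenly across two resources (actions 0 and 1).
    n_on_zero = sum(1 for a in actions if a == 0)
    n_on_one = len(actions) - n_on_zero
    return min(n_on_zero, n_on_one)

def difference_reward(actions, i):
    """Agent i's credit: team reward with agent i present minus the
    counterfactual team reward with agent i removed."""
    counterfactual = actions[:i] + actions[i + 1:]
    return global_reward(actions) - global_reward(counterfactual)

actions = [0, 1, 1, 0, 1]
per_agent = [difference_reward(actions, i) for i in range(len(actions))]
# per_agent == [1, 0, 0, 1, 0]: only agents on the scarce resource
# (action 0) receive credit, unlike a shared global reward signal.
```

Under a shared global reward, every agent would receive the same signal of 2 regardless of its own action; the difference reward isolates each agent's marginal contribution, which is the problem a shaped reward model also targets.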

Cited by 2 publications (1 citation statement)
References 21 publications
“…In order to achieve these objectives, each agent collaborates with neighboring nodes to increase its own reward and contribute to the rewards of others. Drawing inspiration from the principles of frequent measurability and spatial decomposability [57], we have defined a local reward function, r, for each intersection, which enables us to achieve these goals in a highly efficient and effective manner.…”
Section: Reward Function
confidence: 99%