Management of Traffic Signals using Deep Reinforcement Learning in Bidirectional Recurrent Neural Network in ITS

Paul, Ananya; Mitra, Sulata

doi:10.1145/3461598.3461608

Cited by 8 publications

(6 citation statements)

References 5 publications


“…Typically, the agent is trained for numerous episodes to predict the expected cumulative discounted future reward ($G_t$) when certain actions are applied to certain states. The primary goal is to identify a policy function $\pi_\theta(a_t \mid s_t)$, $a_t \in \boldsymbol{A}$, $s_t \in \boldsymbol{S}$, which maximizes $G_t$ as (2) [9].

$$G_t = \sum_{i=t}^{T} \boldsymbol{\gamma}^{\,i-t} R_i,$$

where $T$ represents the number of iterations in an episode.…”

Section: Present Work (mentioning)

confidence: 99%
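The return in the quoted equation (2) can be computed for every step of an episode with a single backward pass. The following is a minimal sketch, not the cited authors' code: the function name `discounted_returns` and the sample rewards are hypothetical, and it assumes the rewards of one finished episode are available as a list.

```python
def discounted_returns(rewards, gamma):
    """Return G_t for every step t, where G_t = sum_{i=t}^{T} gamma**(i - t) * R_i.

    Hypothetical helper for illustration; `rewards` holds R_0..R_T of one episode.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each G_t reuses G_{t+1}: G_t = R_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Illustrative rewards only; not data from the cited papers.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

The backward recursion gives the same values as evaluating the sum directly at each step, but in linear rather than quadratic time.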

“…The bidirectional layers in long short-term memories (LSTMs) and gated recurrent units (GRUs) allow networks to include both backward and forward state information at each time step, ensuring that they are adequately equipped to handle enormous data [9].…”

Section: Present Work (mentioning)

confidence: 99%
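The idea of combining forward and backward state information at each time step can be illustrated with a toy recurrence. This is a simplified sketch under stated assumptions: it uses a scalar tanh cell instead of an LSTM or GRU, and the names `rnn_pass` and `bidirectional_states` and the weights are hypothetical, not taken from the cited work.

```python
import math


def rnn_pass(inputs, w_in, w_rec):
    """Run a scalar tanh recurrence over `inputs`; return the hidden state at each step."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_states(inputs, w_in=0.5, w_rec=0.5):
    """Pair the forward state (past context) with the backward state (future context)."""
    forward = rnn_pass(inputs, w_in, w_rec)
    # Process the reversed sequence, then flip back so index t lines up with the forward pass.
    backward = rnn_pass(inputs[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    for t, (fwd, bwd) in enumerate(bidirectional_states([1.0, 0.0, 0.0])):
        print(t, fwd, bwd)
```

At the last step the forward state still carries a trace of the first input, while the backward state has only seen the final input. A bidirectional layer exposes both views at every step.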

“…Typically, the agent is trained for numerous episodes to predict the expected cumulative discounted future reward ($G_t$) when certain actions are applied to certain states. The primary goal is to identify a policy function $\pi_\theta(a_t \mid s_t)$, $a_t \in A$, $s_t \in S$, which maximizes $G_t$ as (2) [9].…”

Section: DRL (mentioning)

confidence: 99%

“…In transformer, the spatiotemporal input is $d$ number of spatial sequence state inputs, with $x$ representing the number of each spatial input (Figure 4). Each of the $d$ number of spatial inputs received a positional embedding $PE \in \mathbb{R}^{d \times x}$ as (8) and (9).…”

Section: Network Architecture (mentioning)

confidence: 99%

“…The bidirectional layers in long short-term memories (LSTMs) and gated recurrent units (GRUs) allow networks to include both backward and forward state information at each time step, ensuring that they are adequately equipped to handle enormous data [9]. In transformer, the spatiotemporal input is $d$ number of spatial sequence state inputs, with $x$ representing the number of each spatial input (Figure 4).…”

Section: Network Architecture (mentioning)

confidence: 99%


Exploring reward efficacy in traffic management using deep reinforcement learning in intelligent transportation system

Paul, Mitra

2022

ETRI Journal

Self Cite


In the last decade, substantial progress has been achieved in intelligent traffic control technologies to overcome the persistent difficulties of traffic congestion and its adverse effects on smart cities. Edge computing is one such advance, facilitating real-time data transmission among vehicles and roadside units to mitigate congestion. This study demonstrates an edge computing-based deep reinforcement learning system that designs a multiobjective reward function for optimizing different objectives. The system seeks to overcome the challenge of evaluating actions with a simple numerical reward. The selection of reward functions has a significant impact on agents' ability to acquire the ideal behavior for managing multiple traffic signals in a large-scale road network. To ascertain effective reward functions, the agent is trained using the proximal policy optimization method in several deep neural network models, including the state-of-the-art transformer network. The system is verified using both hypothetical scenarios and real-world traffic maps. The comprehensive simulation outcomes demonstrate the potency of the suggested reward functions.


Deep reinforcement learning based cooperative control of traffic signal for multi‐intersection network in intelligent transportation system using edge computing

Paul

2022

Trans Emerging Tel Tech

Self Cite


In the current era, the coordination of traffic flow is hindered by the discrepancy between road infrastructure and the number of vehicles, which leads to traffic congestion. One widely used strategy to mitigate traffic congestion is to control traffic signals with the help of deep reinforcement learning (DRL) in an edge computing-based intelligent transportation system. This article provides a comprehensive analysis of the most recent DRL algorithms, advantage actor-critic and proximal policy optimization, in multiple deep neural networks (DNNs), including a state-of-the-art transformer model for effective traffic signal management. Here, a single DRL agent is used, which obtains the spatio-temporal information of the traffic to identify traffic patterns from complex intersection environments. The agent uses this information as the input to the DNNs and then applies the algorithms to retrieve the essential parameters of the DNN to seek an optimal action selection policy to mitigate congestion. Different real-time maps and small city networks are explored here to determine which DNN is best suited for traffic congestion management. The simulation study reveals that both algorithms significantly outperform the baseline. The transformer model gives the best result when compared to other DNNs. The transformer model decreases average waiting time by 96.16%, implying that it has a higher capability of dealing with congested environments.

An Intelligent Traffic Signal Management Strategy to Reduce Vehicles CO2 Emissions in Fog Oriented VANET

Paul

Haricharan

2021

Wireless Pers Commun
