2019
DOI: 10.3390/en12183461

Application of a Deep Deterministic Policy Gradient Algorithm for Energy-Aimed Timetable Rescheduling Problem

Abstract: Reinforcement learning has potential in the area of intelligent transportation due to its generality and real-time capability. The Q-learning algorithm, one of the earliest reinforcement learning algorithms, has its own merits for solving the train timetable rescheduling (TTR) problem. However, it falls short in two respects: a limit on the dimension of the action space and a slow convergence rate. In this paper, a deep deterministic policy gradient (DDPG) algorithm is applied to solve the energy-aimed train timetable rescheduling (ETTR) problem. T…

Citations: cited by 21 publications (10 citation statements)
References: 25 publications
“…But the value of the disturbance is random. The time model defines the departure instant t_de, travel time t_tr, and dwell time t_dw of each train at each station [33]. The starting station is defined as station no.…”
Section: Assumptions (mentioning)
confidence: 99%
“…If the braking power is less than the traction power, it can be fully used; otherwise, resistors will kick in and consume the overflowing braking power to maintain the train voltage under a safe value. The minimum value of traction power and braking conversion power is defined as P_F [33]:…”
Section: Decision Variables (mentioning)
confidence: 99%
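
The rule quoted in this statement can be illustrated with a minimal Python sketch. It is not the cited paper's equation, which is elided in the excerpt. It only encodes the verbal description: the usable power P_F is the smaller of the traction power and the braking conversion power, and any surplus braking power goes to the resistors. The function name, argument names, and the kW unit are assumptions made for illustration.

```python
def split_braking_power(traction_kw: float, braking_kw: float) -> tuple[float, float]:
    """Return (usable power, power dissipated in resistors).

    Hypothetical helper: names and units are illustrative and do not
    come from the cited paper.
    """
    # P_F: the minimum of traction power and braking conversion power.
    usable = min(traction_kw, braking_kw)
    # Braking power that exceeds the traction demand is burned in resistors.
    dissipated = max(0.0, braking_kw - traction_kw)
    return usable, dissipated


# Braking power below traction power: all of it is used.
print(split_braking_power(500.0, 300.0))
# Braking power above traction power: the overflow is dissipated.
print(split_braking_power(300.0, 500.0))
```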
“…Reinforcement learning is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with environment (see Figure 4) [28,29]. For every state s, the agents always try to maximize the expected discounted return by choosing an action a.…”
Section: Dueling Deep Q-network Architecture (mentioning)
confidence: 99%
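
The "expected discounted return" mentioned in this statement can be made concrete with a short Python sketch. It computes the discounted return of one sampled reward sequence and picks the action with the highest estimated value. The discount factor, the reward list, and the action-value dictionary are illustrative assumptions, not values from the citing or cited papers.

```python
def discounted_return(rewards: list[float], gamma: float = 0.99) -> float:
    """Sum of gamma**k * r_k over one episode, accumulated backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def greedy_action(action_values: dict[str, float]) -> str:
    """Pick the action whose estimated return is largest."""
    return max(action_values, key=action_values.get)


# Hypothetical rewards and action-value estimates, for illustration only.
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
print(greedy_action({"hold": 0.2, "speed_up": 0.7, "extend_dwell": 0.4}))
```

The expectation in the quoted definition is taken over many such reward sequences; this sketch evaluates a single one.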
“…Nitisiri et al [29] developed a parallel multi-objective evolutionary algorithm with hybrid sampling and learning-based mutation to solve the train scheduling problem. To further investigate the rail timetable rescheduling problem, Yang et al [30] introduced the Deep Deterministic Policy Gradient (DDPG) algorithm over a conventional Q-learning strategy to realize the energy-aimed rescheduling by adjusting the cruising speed and dwelling time continuously. It can be found that energy consumption level of a given timetable is actually an inherent attribute.…”
Section: Introduction (mentioning)
confidence: 99%