2019
DOI: 10.1016/j.apenergy.2019.04.021
Deep reinforcement learning of energy management with continuous control strategy and traffic information for a series-parallel plug-in hybrid electric bus

Cited by 241 publications (76 citation statements)
References 35 publications
“…The vehicle speed dataset is collected from real driving conditions. To reduce the training time and improve the control accuracy, the vehicle velocity is divided into three speed intervals, [0–12] m/s, [12–24] m/s, and [24–36] m/s, representing low, medium, and high speed, respectively. Then each classified speed interval is used to train the DDPG algorithm separately until the algorithm converges, and the trained neural network is stored, as depicted in Fig.…”
Section: A Bi-level Framework
confidence: 99%
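The interval-based training scheme in the excerpt above can be sketched minimally: bin the current vehicle speed into one of the three intervals and dispatch to the corresponding per-interval policy. The interval bounds follow the excerpt; the label-to-policy mapping is a placeholder assumption, not the trained DDPG networks.

```python
import numpy as np

# Speed-interval boundaries (m/s) from the excerpt: [0-12), [12-24), [24-36]
SPEED_BINS = [0.0, 12.0, 24.0, 36.0]
LABELS = ["low", "medium", "high"]

def speed_interval(v_mps: float) -> str:
    """Map a vehicle speed (m/s) to its training-interval label."""
    # digitize against the interior edges [12, 24]; clip keeps out-of-range
    # speeds in the nearest interval
    idx = int(np.clip(np.digitize(v_mps, SPEED_BINS[1:-1]), 0, 2))
    return LABELS[idx]
```

At run time, the label would select which separately trained network produces the control action, e.g. `policies[speed_interval(v)]`.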
“…et al employed the max-value-based policy and the random policy to reduce the overestimation in double Q-learning. Since DQL cannot output continuous actions, Tan H. and Wu Y. et al [31], [32] proposed energy management strategies based on DDPG, and simulations showed that the DDPG-based strategy performs better than Q-learning and achieves results close to DP. In [33] and [34], the authors incorporate real-time topographic information, driver habits, and traffic conditions into the training process of DDPG to improve the performance of the EMS.…”
Section: Introduction
confidence: 99%
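The DQL-vs-DDPG distinction in the excerpt above can be illustrated with a toy sketch: a Q-learning policy picks the best action from a finite set, while a DDPG actor maps the state directly to a continuous command. The one-layer "actor", the state size, and the torque bound are illustrative assumptions, not the cited papers' settings.

```python
import numpy as np

rng = np.random.default_rng(0)
T_MAX = 200.0  # assumed engine-torque bound, N*m (illustrative)

# Toy one-layer actor weights; a real DDPG actor is a trained deep network
W = rng.standard_normal(4) * 0.1

def ddpg_actor(state: np.ndarray) -> float:
    """Continuous torque command in [-T_MAX, T_MAX] via tanh squashing."""
    return float(T_MAX * np.tanh(state @ W))

def dql_policy(q_values: np.ndarray, actions: np.ndarray) -> float:
    """Discrete policy: return the finite action with the largest Q-value."""
    return float(actions[int(np.argmax(q_values))])
```

The discrete policy can only ever emit one of its predefined torque levels, which is why the cited works move to DDPG for continuous control.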
“…By equations (5), (6), (12), and (13), and neglecting all inertia, the speed and torque relations among the power sources and the torque output end are derived as follows…”
Section: Powertrain System Modeling
confidence: 99%
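The derived relation itself is elided in the excerpt above ("as follows…"). As a hedged illustration only, a common textbook form of the speed and torque constraints for a single planetary gear set (sun $s$, carrier $c$, ring $r$, tooth numbers $Z_s$, $Z_r$), neglecting inertia and losses, is:

```latex
% Willis equation (kinematic speed constraint):
\omega_s Z_s + \omega_r Z_r = \omega_c \,(Z_s + Z_r)
% Static torque balance (lossless, inertia neglected):
T_r = \frac{Z_r}{Z_s}\, T_s , \qquad T_c = -\,(T_s + T_r)
```

These are generic planetary-gear relations, not reproduced from the cited paper's equations (5)–(13).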
“…The optimization-based control strategies mainly include dynamic programming (DP), the equivalent consumption minimization strategy (ECMS), model predictive control (MPC), and the deep reinforcement learning algorithms that have emerged with artificial intelligence in recent years [12-15]. Compared with the rule-based control strategies, the optimization-based control strategies do not need to divide the vehicle's operation into predefined modes, and an optimal or sub-optimal control solution can be obtained. Therefore, the fuel-saving effect is more pronounced.…”
Section: Introduction
confidence: 99%
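The ECMS named in the excerpt above can be sketched minimally: at each instant, choose the battery/engine power split that minimizes fuel power plus an equivalence factor times battery power. The equivalence factor, the grid search, and the affine engine fuel model are illustrative assumptions, not taken from the cited works.

```python
def ecms_cost(p_fuel_w: float, p_batt_w: float, s: float = 2.5) -> float:
    """Instantaneous equivalent consumption (W): fuel power + s * battery power."""
    return p_fuel_w + s * p_batt_w

def best_split(p_demand_w: float, s: float = 2.5, n: int = 21) -> float:
    """Grid-search the battery share of the power demand; return battery power (W)."""
    candidates = [p_demand_w * k / (n - 1) for k in range(n)]

    def cost(p_batt: float) -> float:
        p_eng = p_demand_w - p_batt
        # assumed affine engine fuel model: P_fuel ~ 1.8 * P_eng + 3 kW idle loss
        p_fuel = 1.8 * p_eng + 3000.0 if p_eng > 0 else 0.0
        return ecms_cost(p_fuel, p_batt, s)

    return min(candidates, key=cost)
```

A large equivalence factor `s` penalizes battery use (engine-heavy split); a small `s` favors the battery, which is the knob real ECMS implementations tune online.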
“…It does not rely on any prediction or predefined rules. Various existing learning-based energy management strategies are discussed in [21]-[26].…”
Section: Learning-based EMS
confidence: 99%