A self-learning energy management strategy is proposed for a plug-in hybrid electric bus by combining the Q-learning (QL) and Pontryagin's minimum principle (PMP) algorithms. Unlike existing strategies, the proposed strategy focuses on expert experience and generalization performance. The expert experience is designed as approximately optimal reference state-of-charge (SOC) trajectories, and the generalization performance is enhanced by a multiple-driving-cycle training method. Specifically, an efficient SOC zone is first designed based on the approximately optimal reference SOC trajectories. Then, the QL agent is trained offline, taking the expert experience as the reference SOC trajectories. Finally, an adaptive strategy is proposed based on the well-trained agent. In particular, two different reward functions are defined: the reward function in the offline training mainly considers the tracking performance between the expert experience and the actual SOC, whereas the reward function in the adaptive strategy mainly considers the punishment term. Simulation results show that the proposed strategy has good generalization performance and can improve the fuel economy by 22.49% compared with a charge-depleting-charge-sustaining (CDCS) strategy.
KEYWORDS: energy management, expert experience, generalization performance, plug-in hybrid electric bus, Q-learning
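To make the reward-shaping idea described in the abstract concrete, the following minimal sketch shows one way a tabular Q-learning agent could be trained offline against a reference SOC trajectory. The state/action discretization, the weights `w_track` and `w_fuel`, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Illustrative (hypothetical) tabular Q-learning setup with an SOC-tracking reward.
# States: discretized (SOC error w.r.t. the reference trajectory, power demand).
# Actions: discretized power-split decisions between engine and battery.
N_SOC_ERR, N_P_DEM, N_ACTIONS = 21, 15, 11
Q = np.zeros((N_SOC_ERR, N_P_DEM, N_ACTIONS))

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount factor, exploration rate


def tracking_reward(soc, soc_ref, fuel_rate, w_track=10.0, w_fuel=1.0):
    """Offline-training reward (assumed form): penalize deviation from the
    expert (reference) SOC trajectory plus instantaneous fuel consumption."""
    return -(w_track * (soc - soc_ref) ** 2 + w_fuel * fuel_rate)


def choose_action(state, rng):
    """Epsilon-greedy action selection over the discrete power-split actions."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))


def q_update(state, action, reward, next_state):
    """Standard one-step Q-learning update of the action-value table."""
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state + (action,)] += ALPHA * (td_target - Q[state + (action,)])
```

In the adaptive (online) strategy, the same update could be reused with the punishment-oriented reward the abstract mentions, e.g. by replacing `tracking_reward` with a term that penalizes SOC excursions outside the efficient SOC zone; the exact form of that penalty is not specified here and would follow the paper's definition.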