“…Adaptive dynamic programming (ADP) [1][2][3][4], which integrates the advantages of reinforcement learning (RL) [5][6][7][8] and adaptive control, has become a powerful tool for solving optimal control problems. Over decades of development, ADP has also yielded approaches to other control problems, such as robust control [9,10], optimal control with input constraints [11,12], optimal tracking control [13,14], zero-sum games [15], and non-zero-sum games [16]. Furthermore, ADP methods have been widely applied to real-world systems, such as the water-gas shift reaction [17], battery management [18], microgrid systems [19,20], and the Quanser helicopter [21].…”