“…Both of the examples show the feasibility and effectiveness of the proposed algorithms. KEYWORDS: approximation dynamic programming (ADP), continuous-time systems, integral reinforcement learning (IRL), online learning, value iteration. … heuristic dynamic programming (HDP), action-dependent HDP, dual HDP (DHP), action-dependent DHP, globalized DHP (GDHP), and action-dependent GDHP [32,33]. In addition, from an implementation point of view, the iteration schemes of ADP can be divided into two classes: policy iteration algorithms and value iteration algorithms. The policy iteration method must start from a given initial admissible policy (the definition will be given herein). However, how to obtain an admissible policy remains an open issue.…”
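To make the distinction between the two iteration schemes concrete, the following is a minimal sketch on a scalar discrete-time linear-quadratic problem. This is an illustrative stand-in, not the continuous-time IRL setting of the quoted paper, and all function names, parameters, and numerical values are assumptions chosen for the example. Here "admissible" is taken to mean a stabilizing gain with finite cost. Value iteration starts from a zero value function, whereas policy iteration cannot even evaluate a non-stabilizing initial gain.

```python
# Scalar system x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2.
# The value function is assumed quadratic, V(x) = p*x^2, and the policy is u = -k*x.
# All names and numbers below are hypothetical, chosen only for illustration.


def value_iteration(a, b, q, r, iters=200):
    """Iterate the Riccati recursion from p = 0; no initial policy is required."""
    p = 0.0
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def evaluate_policy(a, b, q, r, k):
    """Cost coefficient of the fixed policy u = -k*x; finite only if |a - b*k| < 1."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        raise ValueError("gain is not admissible: the closed loop is unstable")
    return (q + r * k * k) / (1.0 - closed_loop ** 2)


def policy_iteration(a, b, q, r, k0, iters=20):
    """Alternate policy evaluation and improvement, starting from an admissible gain k0."""
    k = k0
    p = None
    for _ in range(iters):
        p = evaluate_policy(a, b, q, r, k)   # policy evaluation
        k = a * b * p / (r + b * b * p)      # policy improvement
    return p
```

With an open-loop unstable plant (for example `a = 1.2`), `policy_iteration` needs a stabilizing `k0` such as `1.0`, while `evaluate_policy` rejects `k = 0.0`; `value_iteration` needs no such starting gain and reaches the same cost coefficient.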