Abstract: A two-timescale simulation-based actor-critic algorithm is proposed for solving infinite-horizon Markov decision processes with finite state and compact action spaces under the discounted cost criterion. The algorithm performs gradient search on the slower timescale in the space of deterministic policies and uses simultaneous perturbation stochastic approximation (SPSA)-based gradient estimates. On the faster timescale, the value function corresponding to a given stationary policy is updated and averaged over a fixed number of epochs (for enhanced performance). A proof of convergence to a locally optimal policy is presented. Finally, numerical experiments using the proposed algorithm on flow control in a bottleneck link, modeled as a continuous-time queue, are shown.
Index Terms: Actor-critic algorithms, Markov decision processes, simultaneous perturbation stochastic approximation (SPSA), two-timescale stochastic approximation.
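To make the SPSA idea mentioned in the abstract concrete, the following is a minimal sketch of a generic two-sided SPSA gradient estimate applied to a toy quadratic cost. It is not the paper's actor-critic recursion: there is no critic, no Markov decision process, and no second timescale. The names `quadratic_cost`, `TARGET`, `spsa_gradient`, `spsa_minimize`, and the gain constants are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy problem: minimize the squared distance to TARGET.
TARGET = np.array([1.0, -2.0])


def quadratic_cost(theta):
    """Toy deterministic cost standing in for a simulated policy cost."""
    return float(np.sum((theta - TARGET) ** 2))


def spsa_gradient(cost, theta, c, rng):
    """Two-sided SPSA estimate of the gradient of `cost` at `theta`.

    All components are perturbed at once with independent +/-1 values,
    so only two cost evaluations are needed regardless of dimension.
    """
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = cost(theta + c * delta) - cost(theta - c * delta)
    return diff / (2.0 * c * delta)


def spsa_minimize(cost, theta0, steps=2000, a=0.1, c=0.1, seed=0):
    """Gradient descent driven by SPSA estimates with decaying gains."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for k in range(1, steps + 1):
        a_k = a / k ** 0.602  # step size (a commonly used decay exponent)
        c_k = c / k ** 0.101  # perturbation size
        theta = theta - a_k * spsa_gradient(cost, theta, c_k, rng)
    return theta


if __name__ == "__main__":
    print(spsa_minimize(quadratic_cost, [0.0, 0.0]))
```

In an actor-critic setting along the lines the abstract describes, the cost evaluations would presumably be replaced by value-function estimates produced on the faster timescale, while a slowly decaying step size like `a_k` would drive the policy update.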
To account comprehensively for the energy transmission efficiency of the isolated dual-active-bridge (DAB) DC-DC converter, this paper proposes a control method based on dual-phase-shift (DPS) control that maximizes the power transmission efficiency of the whole converter and optimizes the stress on the switching devices while simultaneously considering current stress, backflow power, and power loss. First, we compare the advantages and disadvantages of single-phase-shift (SPS), DPS, and triple-phase-shift (TPS) control and select DPS as the control method. We then analyze and compare the basic principles of the traditional control method, which targets current stress optimization, and the comprehensive efficiency-optimal control method proposed in this paper. The working modes are compared in terms of stability, current stress, and the applicable range of soft switching. An adaptive genetic algorithm is used to optimize the two duty-ratio degrees of freedom of the control strategy and to obtain the optimal duty ratio for the objective function. Finally, simulation and experimental comparisons verify the correctness of the control method and its superior power transmission efficiency.