Aiming to improve the power efficiency of the dual-active-bridge (DAB) DC-DC converter, this paper proposes an efficiency optimization scheme with triple-phase-shift (TPS) modulation using reinforcement learning (RL). More specifically, Q-learning, a typical RL algorithm, is applied to train an agent offline to obtain an optimized modulation strategy; the trained agent then provides control decisions online in real time for the DAB DC-DC converter according to the current operating environment. The main objective is to obtain the optimal phase-shift angles for the DAB DC-DC converter, which achieve the maximum power efficiency by reducing the power losses. Moreover, all possible operation modes of the TPS modulation are considered during the offline training of the Q-learning algorithm. Thus, the cumbersome process of selecting the optimal operation mode in conventional schemes can be circumvented. Owing to these merits, the proposed RL-based efficiency optimization scheme achieves excellent performance over the whole range of load conditions and voltage conversion ratios. Finally, a 1.2 kW prototype is built, and the simulation and experimental results demonstrate that the power efficiency is improved by the RL-based optimization scheme.
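As a rough illustration of the offline Q-learning training described above, the sketch below learns phase-shift triples over a discretized operating range. Everything here is an assumption for illustration: the quadratic `power_loss` surrogate stands in for the paper's circuit-derived loss model, and the grid resolution, reward, and learning settings are not from the paper.

```python
import random

# Hypothetical power-loss surrogate: the paper derives losses from circuit
# analysis; this convex stand-in just gives the agent something to minimize.
def power_loss(d1, d2, d3, load):
    return (d1 - 0.3 * load) ** 2 + (d2 - 0.5 * load) ** 2 + (d3 - 0.4) ** 2

# Candidate TPS phase-shift triples (per-unit ratios on a coarse grid).
ACTIONS = [(d1 / 4, d2 / 4, d3 / 4)
           for d1 in range(5) for d2 in range(5) for d3 in range(5)]
LOADS = [i / 4 for i in range(1, 5)]          # discretized operating points

Q = {(s, a): 0.0 for s in range(len(LOADS)) for a in range(len(ACTIONS))}
alpha, epsilon = 1.0, 0.2   # deterministic reward -> full learning rate is fine
random.seed(0)

for episode in range(20000):                  # offline training phase
    s = random.randrange(len(LOADS))
    if random.random() < epsilon:             # explore a random triple
        a = random.randrange(len(ACTIONS))
    else:                                     # exploit the best-known triple
        a = max(range(len(ACTIONS)), key=lambda k: Q[(s, k)])
    reward = -power_loss(*ACTIONS[a], LOADS[s])   # lower loss -> higher reward
    Q[(s, a)] += alpha * (reward - Q[(s, a)])     # one-step tabular update

def best_angles(load_idx):
    """Online decision: return the trained agent's phase-shift triple."""
    a = max(range(len(ACTIONS)), key=lambda k: Q[(load_idx, k)])
    return ACTIONS[a]
```

After training, `best_angles` plays the role of the real-time control decision: a table lookup per operating point, with no online mode-selection logic.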
Dynamic area coverage is widely used in military and civil fields, and improving coverage efficiency is an important research direction for multi-agent dynamic area coverage. In this paper, we focus on the non-optimal coverage problem of free dynamic area coverage algorithms. We propose a distributed dynamic area coverage algorithm based on reinforcement learning and a γ-information map. The γ-information map transforms the continuous dynamic coverage process into a discrete γ-point traversal process while ensuring no-hole coverage. When agent communication covers the whole target area, agents can obtain the globally optimal coverage strategy by learning the whole dynamic coverage process. When communication does not cover the whole target area, agents can obtain a locally optimal coverage strategy; in addition, agents can use the proposed algorithm to obtain a globally optimal coverage path through offline planning. Simulation results demonstrate that the time required for area coverage with the proposed algorithm is close to the optimal value, and that the proposed algorithm significantly outperforms distributed anti-flocking algorithms for dynamic area coverage. INDEX TERMS: Dynamic area coverage, multi-agent, reinforcement learning, optimal coverage.
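The γ-information-map idea of turning continuous coverage into discrete point traversal can be sketched as follows. The grid-spacing rule (cell diagonal no larger than the sensing diameter, so visiting every point leaves no holes), the greedy visiting order (the paper instead learns the order with RL), and all dimensions are assumptions for illustration:

```python
import math

def gamma_points(width, height, sensing_radius):
    """Discretize the target area into gamma points: cell centers spaced so
    each cell fits inside an agent's sensing disk (no-hole guarantee)."""
    step = sensing_radius * math.sqrt(2)      # cell diagonal <= 2 * radius
    nx, ny = math.ceil(width / step), math.ceil(height / step)
    return [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(nx) for j in range(ny)]

def greedy_traversal(points, start=(0.0, 0.0)):
    """Baseline traversal: repeatedly visit the nearest unvisited gamma point.
    The paper's agents learn the traversal order instead, approaching the
    optimal coverage time."""
    unvisited, pos, order = set(range(len(points))), start, []
    while unvisited:
        k = min(unvisited, key=lambda i: math.dist(pos, points[i]))
        unvisited.discard(k)
        order.append(k)
        pos = points[k]
    return order

pts = gamma_points(10.0, 10.0, 1.0)   # assumed 10x10 area, unit sensing radius
tour = greedy_traversal(pts)          # visits every gamma point exactly once
```

Because every γ point is visited and each point's cell lies within the sensing disk, the discrete tour implies no-hole coverage of the continuous area.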
Aiming to reduce the current stress and improve the power efficiency of the dual-active-bridge (DAB) converter, this paper proposes a minimum-current-stress scheme based on reinforcement learning (RL) and an artificial neural network (ANN). In the first stage, Q-learning, a typical RL algorithm, is adopted for offline training; the aim of this stage is to solve for the optimized control strategy under triple-phase-shift (TPS) control. More specifically, the ZVS constraints and every effective operation mode are taken into consideration during the training of the Q-learning algorithm. Therefore, a minimum-current-stress scheme that maintains soft switching is obtained after the first stage. In the second stage, the training results of the Q-learning algorithm are used to train an ANN, in order to reduce the computational time and memory allocation. After that, the trained ANN, which acts like an implicit function, provides optimal phase-shift angles online in real time over the entire continuous operating range. Finally, detailed simulation and experimental results are given to demonstrate the effectiveness of the proposed optimized scheme.
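The second stage — distilling the tabular Q-learning results into an ANN that maps the operating point to phase-shift angles — might be sketched like this. The smooth `q_table_lookup` target stands in for the actual Q-learning output, and the network size, learning rate, and epoch count are assumptions for illustration:

```python
import math
import random

random.seed(1)

# Hypothetical stand-in for the stage-one result: a mapping from the
# operating point (load, voltage ratio) to an optimal phase shift.
def q_table_lookup(load, ratio):
    return 0.5 * load + 0.2 * ratio

data = [(random.random(), random.random()) for _ in range(200)]

# Tiny one-hidden-layer network: 2 inputs -> H tanh units -> 1 output.
H = 8
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [math.tanh(W1[i][0] * x[0] + W1[i][1] * x[1] + b1[i]) for i in range(H)]
    return sum(W2[i] * h[i] for i in range(H)) + b2, h

def mse():
    return sum((forward(x)[0] - q_table_lookup(*x)) ** 2 for x in data) / len(data)

lr = 0.05
loss_before = mse()
for epoch in range(200):                      # SGD regression on the table
    for x in data:
        y, h = forward(x)
        err = y - q_table_lookup(*x)          # squared-error gradient factor
        for i in range(H):
            grad_h = err * W2[i] * (1 - h[i] ** 2)   # backprop through tanh
            W2[i] -= lr * err * h[i]
            W1[i][0] -= lr * grad_h * x[0]
            W1[i][1] -= lr * grad_h * x[1]
            b1[i] -= lr * grad_h
        b2 -= lr * err
loss_after = mse()
```

Once trained, a `forward` pass replaces the Q-table lookup, which is the abstract's stated motivation: less memory than the table, and a continuous input range rather than discretized operating points.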