“…Zhen et al [15] model the economic dispatch as 1-step Markov decision process (MDP), i.e., the solution is not generated iteratively, but in a one-shot fashion. They use the Twin-Delayed DDPG (TD3) algorithm to learn to minimize generation costs under multiple constraints.…”
Section: B Learning the Optimal Power Flow (mentioning)
confidence: 99%
“…Therefore, the agent has perfect knowledge of Q, if it knows the reward function. As Zhen et al [15] showed, the OPF approximation can be implemented as a 1-step environment, because the solution of one OPF is independent of the solution of the previous OPF. Exceptions are multi-step OPF problems where the optimization is done over multiple time steps, e.g., when storage systems are part of the optimization.…”
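The 1-step formulation in the excerpt above can be illustrated with a minimal sketch. It is not the environment from Zhen et al. [15] or from the citing paper: the class name `OneStepDispatchEnv`, the two-generator linear cost model, the demand range, and the imbalance penalty are all hypothetical choices made only for illustration. The point it shows is that every episode ends after a single action, so there is no successor state to bootstrap from and Q(s, a) equals the immediate reward r(s, a).

```python
import numpy as np


class OneStepDispatchEnv:
    """Toy 1-step environment (hypothetical): observe a demand, act once, episode ends."""

    def __init__(self, cost_coeffs=(10.0, 20.0), p_max=(100.0, 100.0),
                 penalty=1000.0, seed=0):
        self.cost = np.asarray(cost_coeffs, dtype=float)  # assumed linear cost per MW
        self.p_max = np.asarray(p_max, dtype=float)       # assumed generator limits in MW
        self.penalty = penalty                            # assumed weight on power imbalance
        self.rng = np.random.default_rng(seed)
        self.demand = None

    def reset(self):
        # Each episode draws a fresh demand that is independent of the previous one.
        self.demand = float(self.rng.uniform(50.0, 150.0))
        return np.array([self.demand])

    def reward(self, demand, action):
        # The action is a setpoint in [0, 1] per generator, scaled to MW.
        p = np.clip(np.asarray(action, dtype=float), 0.0, 1.0) * self.p_max
        generation_cost = float(self.cost @ p)
        imbalance = abs(float(p.sum()) - demand)
        return -(generation_cost + self.penalty * imbalance)

    def step(self, action):
        r = self.reward(self.demand, action)
        done = True  # 1-step MDP: the episode always terminates after one action
        return np.array([self.demand]), r, done


env = OneStepDispatchEnv()
obs = env.reset()
action = np.array([1.0, 0.0])
_, r, done = env.step(action)

# With no successor state, the action value is just the immediate reward.
q_value = env.reward(obs[0], action)
```

Under these assumptions, an agent that knows `reward` can evaluate any candidate action exactly, without learning a critic. In a multi-step setting such as dispatch with storage, this shortcut would no longer hold, because the action changes the state seen in the next step.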
Future smart grids can and will be subject to systematic attacks that can result in monetary costs and reduced system stability. These attacks are not necessarily malicious, but can be economically motivated as well. Emerging flexibility markets are of interest here, because they can incite attacks if the market design is flawed. The scale and potential danger of such strategies are still unknown. Automatic analysis tools are required to systematically search for unknown strategies and their respective countermeasures. We propose deep reinforcement learning to learn attack strategies autonomously and thereby identify underlying systemic vulnerabilities. As a proof of concept, we apply our approach to a reactive power market setting in a distribution grid. In the case study, the attacker learned to exploit the reactive power market by using controllable loads. It did so by systematically inducing constraint violations in the system and then offering countermeasures on the flexibility market to generate profit, thus finding a hitherto unknown attack strategy. As a weak point, we identified the optimal power flow that was used for market clearing. Our general approach is applicable to detect unknown attack vectors, to analyze a specific power system regarding vulnerabilities, and to systematically evaluate potential countermeasures.
“…In the context of ACOPF, neural networks can either be trained by imitation (supervised learning) or by interaction with a simulator through Reinforcement Learning (RL) [7]. Recent work explores the application of deep neural networks to ACOPF [8], while others [9], [10], [11], [12] frame the ACOPF problem as a closed-loop RL problem.…”
“…After training, the AI-based agent can adjust power flow states rapidly and is suitable for online applications. An RL-based optimal power flow solution method has been proposed in [58] using PSOPS and the twin-delayed deep deterministic policy gradient (TD3) algorithm [59]. In this paper, a TD3-based SOPF solution program is realized using Py_PSOPS.…”
With the rapid development of artificial intelligence (AI), it is foreseeable that the accuracy and efficiency of dynamic analysis for future power systems will be greatly improved by the integration of dynamic simulators and AI. To explore the interaction mechanism of power system dynamic simulations and AI, a general design for AI-oriented power system dynamic simulators is proposed, which consists of a high-performance simulator with neural network supportability and flexible external and internal application programming interfaces (APIs). With the support of these APIs, simulation-assisted AI and AI-assisted simulation form a comprehensive interaction mechanism between power system dynamic simulations and AI. A prototype of this design is implemented and made public based on a highly efficient electromechanical simulator. Tests of this prototype are carried out in four scenarios: sample generation, AI-based stability prediction, data-driven dynamic component modeling, and AI-aided stability control. The tests demonstrate the validity, flexibility, and efficiency of the design and implementation for AI-oriented power system dynamic simulators.