Dawei Qiu (S'18) is currently pursuing the Ph.D. degree at Imperial College London, London, U.K. His current research interests include game-theoretic and agent-based modeling in wholesale as well as retail electricity markets. Mingyang Sun (M'16) received the Ph.D. degree from Imperial College London, London, U.K., in 2017. He is currently a Research Associate in this institution. His current research interests include big data analytics and artificial intelligence in energy systems. Dimitrios Papadaskalopoulos (M'13) is a Research Fellow at Imperial College London, London, U.K. His current research focuses on the development and application of distributed and market-based approaches for the coordination of operation and planning decisions in power systems, employing optimization and game theoretic principles. Goran Strbac (M'95) is a Professor of Electrical Energy Systems at Imperial College London, London, U.K. His research interests include electricity system operation, investment and pricing, and integration of renewable generation and distributed energy resources.
Lagrangian multipliers associated with the capacity constraints of each transmission line (£/MW).
Lagrangian multipliers associated with the voltage angle constraints at node n and period h (£/rad).
Lagrangian multiplier associated with the voltage angle value at the reference node (£/rad).
Previous works analysing imperfect electricity markets have employed conventional game-theoretic approaches. However, such approaches require each strategic market player to have full knowledge of the operating parameters and strategies of its rivals, as well as of the computational algorithm of the market clearing process. This unrealistic assumption, along with the modeling and computational complexities, renders such approaches less applicable to practical multi-period and multi-spatial equilibrium analysis. This paper proposes a novel multi-agent deep reinforcement learning (MA-DRL) based methodology, combining multi-agent intelligence, the deep policy gradient (DPG) method, and an innovative long short-term memory (LSTM) based representation network, for optimizing the offering strategies of multiple self-interested generation companies (GENCOs) and exploring the market outcome stemming from their interactions. The proposed approach is tailored to the nature of the examined problem by posing it, for the first time, in multi-dimensional continuous state and action spaces. This enables GENCOs to receive accurate feedback on the impact of their offering strategies on the market clearing outcome and to devise more profitable bidding decisions by exploiting the entire action domain, which in turn facilitates more accurate equilibrium analysis. The proposed LSTM-based representation network extracts discriminative features, which further improve the learning performance and thus promise more profitable offering strategies for each GENCO. Case studies demonstrate that the proposed method i) achieves a significantly higher profit than state-of-the-art RL methods for a single GENCO's optimal offering strategy problem and ii) outperforms state-of-the-art equilibrium programming models in efficiently identifying an imperfect market equilibrium with and without network congestion.
Quantitative economic analysis is carried out on the obtained equilibrium.

INDEX TERMS Deep neural networks, deep reinforcement learning, electricity markets, equilibrium programming, imperfect competition, multi-agent intelligence, strategic offering.

NOMENCLATURE

A. INDICES AND SETS
i Ramp up / down limit of GENCO i (MW).
D_{j,h} Power input of demand j at hour h (MW).

C. VARIABLES
θ_{n,h} Voltage angle at node n and period h (rad).
o_{i,h} Strategic offering variable of GENCO i at hour h.
g_{i,h,b} Power output of block b of GENCO i at hour h (MW).
λ_{n,h} Locational marginal price at node n and hour h (£/MWh).
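To make the link between a strategic offering variable and the market clearing outcome concrete, the following is a minimal sketch and not the paper's model. It assumes a single node and a single hour, with no network, ramping, or voltage angle constraints. It also assumes, purely for illustration, that the offering variable of GENCO i acts as a multiplicative markup on the marginal cost of each of its blocks. The names `Block` and `clear_market` and all numerical data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    genco: int    # index i of the owning GENCO
    cost: float   # true marginal cost of the block (£/MWh)
    size: float   # block capacity (MW)


def clear_market(blocks, offers, demand):
    """Merit-order clearing for one node and one hour (illustrative only).

    offers[i] is an assumed multiplicative markup (>= 1) that GENCO i applies
    to the marginal cost of all its blocks. Returns the clearing price and a
    dict with the profit of each GENCO.
    """
    # Rank blocks by offered price, cheapest first.
    ranked = sorted(blocks, key=lambda b: offers[b.genco] * b.cost)
    remaining = demand
    price = 0.0
    dispatch = []
    for b in ranked:
        if remaining <= 0:
            break
        g = min(b.size, remaining)           # accepted output of this block (MW)
        remaining -= g
        price = offers[b.genco] * b.cost     # last accepted offer sets the price
        dispatch.append((b, g))
    if remaining > 1e-9:
        raise ValueError("offered capacity cannot cover demand")
    # Profit is computed against the true cost, not the offered price.
    profits = {i: 0.0 for i in offers}
    for b, g in dispatch:
        profits[b.genco] += (price - b.cost) * g
    return price, profits


# Hypothetical two-GENCO system with a demand of 120 MW.
blocks = [
    Block(0, 10.0, 50.0), Block(0, 20.0, 50.0),
    Block(1, 15.0, 50.0), Block(1, 30.0, 50.0),
]
price_comp, profit_comp = clear_market(blocks, {0: 1.0, 1: 1.0}, 120.0)
price_strat, profit_strat = clear_market(blocks, {0: 1.25, 1: 1.0}, 120.0)
print(price_comp, profit_comp)
print(price_strat, profit_strat)
```

In this toy system GENCO 0 raises the clearing price, and with it both GENCOs' profits, by offering above its marginal cost. Because the markup is a continuous quantity, a learning agent can search the whole interval of admissible values instead of a coarse discrete grid, which is the motivation the abstract gives for continuous action spaces.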