“…Considering the above provisions from Equations (1)-(3), the modeling approaches of all the studies introduced above, except for reference [19], are not well suited to the actual situation of the day-ahead electricity market. That is because no market participant has information about the cost and revenue functions of all its rivals (and therefore cannot know their profits), about the ongoing and historical strategies of all rivals, or even about the probability distribution functions governing rivals' strategy choices.…”
Section: Literature Review and Main Contributions
“…That is because no market participant has information about the cost and revenue functions of all its rivals (and therefore cannot know their profits), about the ongoing and historical strategies of all rivals, or even about the probability distribution functions governing rivals' strategy choices. The modeling and simulation approach in [19] does not require that information: an agent representing a participant learns, from its past experience, the best strategy for a given state of the market (the MCP formed in the last iteration). In the work of Salehizadeh [24], an agent-based fuzzy Q-learning algorithm was used to model the dynamic bidding strategy adjustment of GenCOs in a spot electricity market under renewable power penetration, with fuzzy rules used to define the continuously changing states of renewable power production.…”
Section: Literature Review and Main Contributions
“…Azadeh et al. [18] simulated the dynamic adjustment process of GenCOs in a day-ahead market through a multi-agent-based method. In the work of Rahimiyan et al. [19], a GenCO's optimal bidding strategy problem was modeled and simulated both by a Q-learning algorithm with discrete state and action sets and by a game-model-based approach. A comparison of the two methods confirms the superiority of Q-learning for this problem.…”
Section: Literature Review and Main Contributions
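The tabular Q-learning approach attributed to [19] — a discretized last-round MCP as the state, a few discrete bid mark-ups as actions — can be illustrated with a short sketch. Everything below is a toy stand-in, not the cited paper's actual model: the clearing probability, the profit-like reward, the state transition, and all parameter values are illustrative assumptions.

```python
import numpy as np

def q_learning_bidder(n_states=10, n_actions=5, episodes=500,
                      alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning sketch for a single GenCO bidding agent.

    The state stands in for a discretized last-round MCP and each
    action is one of a few discrete bid mark-ups; the reward is a toy
    surrogate for the GenCO's profit, not a real market model.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    markups = np.linspace(1.0, 1.4, n_actions)   # bid = cost * markup
    state = 0
    for _ in range(episodes):
        # epsilon-greedy selection over the discrete mark-ups
        if rng.random() < epsilon:
            action = rng.integers(n_actions)
        else:
            action = int(np.argmax(Q[state]))
        # Toy market: a higher mark-up pays more but risks not clearing
        clear_prob = 1.0 - 0.8 * (markups[action] - 1.0) / 0.4
        cleared = rng.random() < clear_prob
        reward = (markups[action] - 1.0) if cleared else 0.0
        next_state = rng.integers(n_states)      # stand-in for next MCP bucket
        # Standard Q-learning update
        Q[state, action] += alpha * (
            reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state
    return Q
```

The limitation the snippets point out is visible in the signature: both the state space and the action set must be enumerated in advance, which is exactly what fails for a continuous MCP and continuous bids.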
An important goal of China's electric power system reform is to create a double-side day-ahead wholesale electricity market, in which suppliers (represented by GenCOs) and demanders (represented by DisCOs) compete with each other simultaneously. Scientifically modeling and simulating the dynamic bidding process and the equilibrium of such a double-side day-ahead electricity market is therefore important not only to developed countries but also to China: it provides a bidding decision-making tool that helps GenCOs and DisCOs earn larger profits in market competition, and an economic analysis tool that helps government officials design proper market mechanisms and policies. Traditional dynamic game models and table-based reinforcement learning algorithms have already been employed in day-ahead electricity market modeling. However, those models rest on assumptions, such as taking the probability distribution function of the market clearing price (MCP) and each rival's bidding strategy as common knowledge (in dynamic game market models), or assuming discrete state and action sets for every agent (in table-based reinforcement learning market models), that no longer hold in realistic situations. In this paper, a modified reinforcement learning method, the gradient descent continuous Actor-Critic (GDCAC) algorithm, is employed to model and simulate the double-side day-ahead electricity market. This algorithm not only dispenses with the unrealistic assumptions above, but also handles Markov decision processes with continuous state and action sets, just like the real electricity market. Meanwhile, the time complexity of the proposed model is only O(n).
Simulation results of applying the proposed model to the double-side day-ahead electricity market show the superiority of our approach, in terms of participants' profits and social welfare, over traditional reinforcement learning methods.
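The abstract describes GDCAC only at a high level. A minimal sketch of a gradient-descent actor-critic with linear function approximation and a Gaussian policy over continuous states and actions — under illustrative assumptions (RBF features over a normalized scalar state, a fixed exploration width, toy learning rates), not the paper's actual formulation — might look like:

```python
import numpy as np

class GDActorCritic:
    """Gradient-descent actor-critic with linear function approximation.

    State and action are continuous; the policy is Gaussian with a
    state-dependent mean.  Every update touches each of the n features
    exactly once, so one learning step costs O(n).
    """

    def __init__(self, n_features, alpha_actor=0.01, alpha_critic=0.1,
                 gamma=0.9, sigma=0.1, width=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        # RBF centers spread over the normalized state interval [0, 1]
        self.centers = np.linspace(0.0, 1.0, n_features)
        self.width = width
        self.w = np.zeros(n_features)      # critic weights: V(s) = w . phi(s)
        self.theta = np.zeros(n_features)  # actor weights:  mu(s) = theta . phi(s)
        self.alpha_a, self.alpha_c = alpha_actor, alpha_critic
        self.gamma, self.sigma = gamma, sigma

    def features(self, s):
        # Gaussian radial-basis features of the scalar state (O(n))
        return np.exp(-((s - self.centers) ** 2) / (2 * self.width ** 2))

    def act(self, s):
        # Sample a continuous action from the Gaussian policy
        return self.theta @ self.features(s) + self.sigma * self.rng.standard_normal()

    def update(self, s, a, r, s_next):
        phi, phi_next = self.features(s), self.features(s_next)
        # TD error from the critic's value estimates
        delta = r + self.gamma * (self.w @ phi_next) - (self.w @ phi)
        self.w += self.alpha_c * delta * phi                 # critic gradient step
        mu = self.theta @ phi
        # Policy-gradient (score-function) step for the Gaussian mean
        self.theta += self.alpha_a * delta * (a - mu) / self.sigma ** 2 * phi
        return delta
```

In a market simulation, the state would be the last-cleared MCP (normalized), the action a continuous bid, and the reward the participant's profit after clearing; each update touches every feature once, which is consistent with the O(n) time complexity claimed above.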
“…A simulation framework is designed using agent-based modeling of electricity systems to test the Wholesale Power Market Platform proposed by the US Federal Energy Regulatory Commission (Sun and Tesfatsion, 2007). The supplier agent's bidding problem is modeled as a self-play problem using the Q-Learning (QL) algorithm, and its performance is compared with that of a proposed model-based approach (Rahimiyan and Rajabi Mashhadi, 2008). The effect of power suppliers' market power on their bidding strategies is also evaluated under a pay-as-bid auction in the Iran electricity market.…”
Section: Article In Press
“…Each PSA is an intelligent agent that chooses the best strategy in competition with rivals by learning from past experience. In the introduced ACE structure, each PSA's learning behavior is modeled using the QL algorithm previously applied by the authors (Rahimiyan and Rajabi Mashhadi, 2008).…”
Abstract: Choosing a desired policy for the divestiture of dominant firms' generation assets has been a challenging task and an open question for regulatory authorities. To deal with this problem, in this paper an analytical method and an agent-based computational economics (ACE) approach are used for ex-ante analysis of divestiture policy in reducing market power. The analytical method is applied to solve a designed concentration boundary problem, even for situations where the cost data of generators are unknown. The concentration boundary problem is the problem of minimizing or maximizing market concentration subject to the operation constraints of the electricity market. It is proved here that the market concentration corresponding to an operation condition always lies within an interval calculated by the analytical method. For situations where the cost functions of generators are available, ACE is used to model the electricity market. In ACE, each power producer's profit-maximization problem is solved by the computational approach of Q-learning. A power producer using the Q-learning method learns from past experience to implicitly identify its market power and find the desired response in competing with rivals. Both methods are applied to a multi-area power system, and the effects of different divestiture policies on market behavior are analyzed.