2012
DOI: 10.1016/j.isatra.2012.06.010

Optimal control in microgrid using multi-agent reinforcement learning

Cited by 70 publications (34 citation statements)
References 19 publications
“…Q*(s,a) is defined as follows [13]:

$$Q^*(s,a)=\sum_{s'} P_{ss'}(s'\mid s,a)\,\bigl[r(s,s',a)+\gamma \max_{a'} Q^*(s',a')\bigr]$$

where s is the current state, s' is the next state, γ is the discount factor, P_{ss'}(s'|s,a) is the probability of reaching state s' when action a is taken in state s, and r(s,s',a) is the reward the agent receives. Since P_{ss'}(s'|s,a) and r(s,s',a) are uncertain (the outcome of the selected action is not known in advance), each Q(s,a) approximates Q*(s,a) by iteration.…”
Section: Q-learning Algorithm (mentioning)
confidence: 99%
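The excerpt above describes the standard tabular Q-learning recursion, in which Q(s,a) is updated iteratively because the transition probabilities and rewards are not known in advance. The following is a minimal sketch of that update rule, assuming a generic discrete environment with a reset()/step() interface; the environment, state/action sizes, and hyperparameters are illustrative assumptions and are not taken from the cited papers.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning sketch; assumes env.reset() -> s and env.step(a) -> (s_next, r, done)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration over the current Q estimates
            if np.random.rand() < epsilon:
                a = np.random.randint(n_actions)
            else:
                a = int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            # Iterative approximation of Q*(s,a):
            # move Q(s,a) toward r + gamma * max_a' Q(s_next, a')
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            s = s_next
    return Q
```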
“…Under grid-connected mode, dynamic hierarchical reinforcement learning is established to minimize electricity costs while satisfying the generation limits of the units and the power balance between production and consumption in a microgrid [12]. A two-steps-ahead reinforcement learning algorithm is proposed to make use of time-dependent environmental experience and optimize the battery scheduling in an energy management system for a microgrid [13].…”
Section: Introduction (mentioning)
confidence: 99%
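As a rough illustration of the kind of objective these grid-connected formulations optimize, the sketch below defines a per-step reward that penalizes electricity cost together with violations of the generation limits and the production-consumption balance. All symbols, signs, and weights here are assumptions for illustration; they are not taken from [12] or [13].

```python
def step_reward(price, p_grid, p_gen, p_load, p_min, p_max, penalty=1e3):
    """Illustrative reward: negative electricity cost minus constraint penalties."""
    cost = price * max(p_grid, 0.0)                   # pay only for imported power
    limit_violation = max(p_gen - p_max, 0.0) + max(p_min - p_gen, 0.0)
    balance_violation = abs(p_gen + p_grid - p_load)  # production vs. consumption mismatch
    return -(cost + penalty * (limit_violation + balance_violation))
```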
“…In contrast to conventional distributed methods, learning-based methods can be easily adapted to a real-time problem after the offline training process. Within RL, Q-learning is a popular method that is widely used for the optimal operation of microgrids [19][20][21][22][23]. A fitted-Q-iteration-based algorithm has been proposed in [19] for a BESS.…”
Section: Introduction (mentioning)
confidence: 99%
“…By using this method, the utilization rate of the battery is increased during periods of high electricity demand, while the utilization rate of the wind turbine for local demand is also increased to reduce consumer dependence on the utility grid. The authors in [23] have presented an improved RL method to minimize the operation cost of an MG in grid-connected mode.…”
Section: Introduction (mentioning)
confidence: 99%