2023
DOI: 10.1109/jsyst.2022.3222262
Twin Delayed Deep Deterministic Policy Gradient (TD3) Based Virtual Inertia Control for Inverter-Interfacing DGs in Microgrids

Cited by 10 publications (7 citation statements)
References 43 publications
“…This additional controller, implementing VI, is integrated into the microgrid system, utilising model predictive control (MPC) with a response mechanism for VI. Using the TD3 scheme, an estimation method for inertia level is demonstrated in [105].…”
Section: Intelligent Strategies
confidence: 99%
“…Additionally, TD3 applies random noise to the target policy during training, which encourages exploration and can lead to better generalization [88]. In recent years, TD3 has been applied to a variety of real-world problems with promising results. For example, in reference [89], the authors propose a TD3-based controller for the PA in the Internet of Robotic Things (IoRT) network.…”
Section: Twin Delayed Deep Deterministic Policy Gradient (TD3)
confidence: 99%
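For context on the target-policy noise mentioned in this statement, a minimal sketch of TD3-style target policy smoothing is given below, assuming a PyTorch-style setup; the function and parameter names (`actor_target`, `policy_noise`, `noise_clip`, `max_action`) are illustrative assumptions, not taken from the cited paper.

```python
# Illustrative sketch only: TD3-style target policy smoothing, in which clipped
# Gaussian noise is added to the target actor's action before that action is
# used to form the TD target. Parameter names and default values are
# assumptions, not taken from the cited paper.
import torch

def smoothed_target_action(actor_target, next_state,
                           policy_noise=0.2, noise_clip=0.5, max_action=1.0):
    """Return the target-policy action with clipped smoothing noise applied."""
    with torch.no_grad():
        next_action = actor_target(next_state)                  # mu'(s')
        noise = torch.randn_like(next_action) * policy_noise    # eps ~ N(0, sigma)
        noise = noise.clamp(-noise_clip, noise_clip)             # clip(eps, -c, c)
        next_action = (next_action + noise).clamp(-max_action, max_action)
    return next_action
```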
“…One key feature of TD3 is the use of two critics instead of one, which helps to better estimate the Q-values. Additionally, TD3 applies random noise to the target policy during training, which encourages exploration and can lead to better generalization [88]. In recent years, TD3 has been applied to a variety of real-world problems with promising results.…”
Section: Literature Review
confidence: 99%
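The two-critic mechanism this statement refers to can be illustrated with a short sketch of TD3's clipped double-Q target, again assuming a PyTorch-style setup; `critic1_target`, `critic2_target`, and `discount` are illustrative names and values, not taken from the cited paper.

```python
# Illustrative sketch only: TD3's clipped double-Q target, where two target
# critics are evaluated at the smoothed target action and the smaller estimate
# is kept to counter Q-value overestimation. Names such as `critic1_target`
# and `discount` are assumptions, not taken from the cited paper.
import torch

def clipped_double_q_target(critic1_target, critic2_target,
                            reward, next_state, next_action, done,
                            discount=0.99):
    """Compute y = r + gamma * (1 - done) * min(Q1'(s', a'), Q2'(s', a'))."""
    with torch.no_grad():
        q1 = critic1_target(next_state, next_action)
        q2 = critic2_target(next_state, next_action)
        y = reward + (1.0 - done) * discount * torch.min(q1, q2)
    return y
```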
“…The probability of picking an action a_t at a given state s_t is represented through the policy π(a_t | s_t). The goal of the agent is to achieve the optimal policy π*, which will maximize the total long-term anticipated reward R. The assessment of benefit in reaching a particular state is determined through the state value function V^π(s_t) [35] and is expressed as…”
Section: Working With TD3PG Control Agents
confidence: 99%
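The quoted passage is truncated before the expression it introduces; the conventional discounted-return definition of the state value function, which the sentence appears to lead into, is shown below in its standard textbook form (an assumption, not a quotation from the cited paper).

```latex
% Conventional discounted-return definition of the state value function.
% The quoted sentence is truncated before its expression, so this is the
% standard textbook form, not a quotation from the cited paper.
V^{\pi}(s_t) = \mathbb{E}_{\pi}\left[\, \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1} \,\middle|\, s_t \right],
\qquad 0 \le \gamma < 1 .
```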