Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings (2021)
DOI: 10.1109/tsg.2020.3011739

Cited by 167 publications (59 citation statements)
References 35 publications

“…Since DRL problems are mainly based on the Markov Decision Process (MDP) framework or its variants (e.g., partially observable MDPs [30], Markov games [17]), we first introduce the background of MDPs. Typically, an MDP is defined by a five-tuple (S, A, P, R, γ), where S and A denote the sets of states and actions, respectively.…”
Section: A. MDP
confidence: 99%
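The five-tuple in this excerpt implies the standard discounted-return objective. As a reference point only (this is the textbook formulation, not an equation quoted from the citing paper), it can be written as:

```latex
% Standard MDP objective implied by the five-tuple (S, A, P, R, \gamma);
% textbook form given for reference, not taken from the cited works.
\[
  \pi^{*} = \arg\max_{\pi}\;
  \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, R(s_t, a_t)\right],
  \qquad s_{t+1} \sim P(\cdot \mid s_t, a_t),\quad
  a_t \sim \pi(\cdot \mid s_t).
\]
```

Here P is the transition kernel, R the reward function, and γ ∈ [0, 1) the discount factor completing the five-tuple.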
“…Since there are many DRL methods in the [47]. For most of the existing works on DRL for building energy systems, model-free methods have been used, and these can be further classified into several types as in [48]: Q-learning methods (e.g., [63]), actor-critic methods (e.g., Advantage Actor-Critic (A2C) [64], Asynchronous Advantage Actor-Critic (A3C) [65]), and maximum entropy methods (e.g., Multi-Actor Attention-Critic (MAAC) [17], Entropy-Based Collective Advantage Actor-Critic (EB-C-A2C) [27], Entropy-Based Collective Deep Q-Network (EB-C-DQN) [27]). Among the above-mentioned methods, Q-learning methods do not support continuous actions.…”
Section: DRL Classification
confidence: 99%
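The closing point of this excerpt, that Q-learning methods do not support continuous actions, follows from value-based action selection being an argmax over a finite action set. A minimal PyTorch sketch (hypothetical network sizes and placeholder state contents, not code from any of the cited works) illustrates why:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per discrete action (hypothetical sizes)."""
    def __init__(self, state_dim: int = 8, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def greedy_action(q_net: QNetwork, state: torch.Tensor) -> int:
    # Value-based control: pick argmax_a Q(s, a) over a finite action set.
    with torch.no_grad():
        return int(q_net(state).argmax().item())

if __name__ == "__main__":
    q = QNetwork()
    state = torch.zeros(8)  # placeholder state (e.g., zone temperature readings)
    print("greedy action index:", greedy_action(q, state))
```

For continuous HVAC setpoints this maximisation has no closed form, which is why the actor-critic and maximum-entropy families listed above learn an explicit policy instead.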