Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings

Yu, Liang; Sun, Yi; Xu, Zhanbo; Shen, Chao; Yue, Dong; Jiang, Tao; Guan, Xiaohong

doi:10.1109/tsg.2020.3011739

Cited by 167 publications

(59 citation statements)

References 35 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Since DRL problems are mainly based on Markov Decision Process (MDP) framework or its variants (e.g., Partially observable MDP [30], Markov games [17]), we first introduce the background of MDP. Typically, an MDP is defined by a fivetuple (S, A, P, R, γ), where S and A denote the sets of state and action, respectively.…”

Section: A Mdpmentioning

confidence: 99%

“…Since there are many DRL methods in the [47]. For most of existing works on DRL for building energy systems, model-free methods have been used and can be further classified into several types as in [48] [63], Advantage Actor-Critic (A2C) [64], Asynchronous Advantage Actor-Critic (A3C) [65]), and maximum entropy methods (e.g., Multi-Actor Attention-Critic (MAAC) [17], Entropy-Based Collective Advantage Actor-Critic (EB-C-A2C) [27], Entropy-Based Collective Deep Q-Network (EB-C-DQN) [27]). In above-mentioned methods, Q-learning methods do not support continuous actions.…”

Section: Drl Classificationmentioning

confidence: 99%

“…In model-based methods, DRL agents need to learn building environment models based on historical data Long Short-Term Memory-Deep Deterministic Policy Gradients (LSTM-DDPG) [46], differentiable mode policy-Proximal Policy Optimization (differentiable MPC policy-PPO) [47]. [63], Advantage [64], Asynchronous Advantage Actor-Critic (A3C) [65]), and maximum entropy methods (e.g., Multi-Ac (MAAC) [17], Entropy-Based Collective Advantage Actor-Critic (EB-C-A2C) [27], Entropy-Based Collecti (EB-C-DQN) [27]). In above-mentioned methods, Q-learning methods do not support continuous actions.…”

Section: Applications Of Drl In a Single Buildingmentioning

confidence: 99%

“…Secondly, there are many sources of uncertainties related to SBEM [16], e.g., renewable generation output, electricity price, indoor temperature, outdoor temperature, CO 2 concentration, number of occupants and power demand of appliances. Thirdly, there are many temporally and spatially coupled operational constraints related to energy subsystems [17] [18], e.g., Heating, Ventilation, and Air Conditioning (HVAC) systems, Energy Storage Systems (ESSs), Electric Vehicles (EVs), which means that the current system decision will affect the future decisions and the decisions among different subsystems should be coordinated. Fourthly, it is difficult to solve large-scale building energy optimization problems in real-time when traditional optimization methods are adopted [19].…”

Section: Introductionmentioning

confidence: 99%

“…• DRL methods can operate in an online way without knowing any forecast information or statistics information of building environments, which can effectively over- come the challenges brought by system uncertainties and temporally-coupled constraints related to HVAC systems, ESSs, EVs and so on [16] [25]. • Multi-agent DRL methods support the flexible coordination among different building energy subsystems, which can deal with spatially-coupled operational constraints very well [17]. • DRL methods can support "end-to-end" control for largescale building energy optimization problems.…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Deep Reinforcement Learning for Smart Home Energy Management

Xie

et al. 2020

IEEE Internet Things J.

Self Cite

260

View full text Add to dashboard Cite

In this paper, we investigate an energy cost minimization problem for a smart home in the absence of a building thermal dynamics model with the consideration of a comfortable temperature range. Due to the existence of model uncertainty, parameter uncertainty (e.g., renewable generation output, nonshiftable power demand, outdoor temperature, and electricity price) and temporally-coupled operational constraints, it is very challenging to determine the optimal energy management strategy for scheduling Heating, Ventilation, and Air Conditioning (HVAC) systems and energy storage systems in the smart home. To address the challenge, we first formulate the above problem as a Markov decision process, and then propose an energy management strategy based on Deep Deterministic Policy Gradients (DDPG). It is worth mentioning that the proposed strategy does not require the prior knowledge of uncertain parameters and building thermal dynamics model. Simulation results based on real-world traces demonstrate the effectiveness and robustness of the proposed strategy.

show abstract

Section: A Mdpmentioning

confidence: 99%

Section: Drl Classificationmentioning

confidence: 99%

Section: Applications Of Drl In a Single Buildingmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Deep Reinforcement Learning for Smart Home Energy Management

Xie

et al. 2020

IEEE Internet Things J.

Self Cite

260

View full text Add to dashboard Cite

show abstract

Reinforcement Learning for Intelligent Building Energy Management System Control *

Kotevska

Andelfinger

2022

Intelligent Data Mining and Analysis in Power and Energy Systems

View full text Add to dashboard Cite

Heating ventilation air‐conditioner system for multi‐regional commercial buildings based on deep reinforcement learning

Yang,

Yu,

Wang

2024

Adv Control Appl

View full text Add to dashboard Cite

In an era of significant energy consumption by commercial building HVAC systems, this study introduces a Deep Reinforcement Learning (DRL) approach to optimize these systems in multi‐zone commercial buildings, targeting reduced energy usage and enhanced user comfort. The research begins with the development of an energy consumption model for multi‐zone HVAC systems, considering the complexity and uncertainty of system parameters. This model informs the creation of a novel DRL‐based optimization algorithm, which incorporates multi‐stage training and a multi‐agent attention mechanism, enhancing stability and scalability. Comparative analysis against traditional control methods shows the proposed algorithm's effectiveness in reducing energy consumption while maintaining indoor comfort. The study presents an innovative DRL strategy for energy management in commercial HVAC systems, offering substantial potential for sustainable practices in building management.

show abstract

Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings

Cited by 167 publications

References 35 publications

Deep Reinforcement Learning for Smart Home Energy Management

Deep Reinforcement Learning for Smart Home Energy Management

Reinforcement Learning for Intelligent Building Energy Management System Control *

Heating ventilation air‐conditioner system for multi‐regional commercial buildings based on deep reinforcement learning

Contact Info

Product

Resources

About