2020
DOI: 10.1007/s10723-020-09512-4
Modeling-Learning-Based Actor-Critic Algorithm with Gaussian Process Approximator

Cited by 11 publications (4 citation statements)
References 31 publications
“…However, the actor-critic method is the one that can help the learning process with fewer samples and computational resources, and it combines the advantages of both Monte Carlo policy gradient and value-based methods [30]. Various other advanced algorithms have been introduced to overcome the shortcomings of the algorithms explained above, such as advantage actor-critic (A2C), asynchronous advantage actor-critic (A3C), double DQN (DDQN), trust region policy optimization (TRPO), and proximal policy optimization (PPO). Still, we have not found the usage of any such algorithms in any of the surveyed papers.…”
Section: Pros and Cons of Policy-Based Methods
confidence: 99%
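The actor-critic structure this excerpt describes pairs a policy (the actor), updated by a gradient step, with a value estimate (the critic) that stands in for the high-variance Monte Carlo return. Below is a minimal one-step actor-critic sketch in that spirit; the chain environment, step sizes, and episode count are illustrative assumptions, not details from the cited papers.

```python
# Minimal one-step actor-critic on a toy chain MDP (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
gamma, alpha_v, alpha_pi = 0.95, 0.1, 0.05

theta = np.zeros((n_states, n_actions))  # policy logits (actor)
V = np.zeros(n_states)                   # state-value estimates (critic)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(s, a):
    # Toy dynamics: action 1 moves right, action 0 moves left;
    # reaching the last state pays reward 1 and ends the episode.
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s2 == n_states - 1
    return s2, (1.0 if done else 0.0), done

for episode in range(2000):
    s, done = 0, False
    while not done:
        probs = softmax(theta[s])
        a = rng.choice(n_actions, p=probs)
        s2, r, done = step(s, a)
        # The critic's TD error drives both updates (advantage estimate).
        td = r + (0.0 if done else gamma * V[s2]) - V[s]
        V[s] += alpha_v * td                   # critic: value update
        grad_log = -probs
        grad_log[a] += 1.0                     # d/dtheta log pi(a|s)
        theta[s] += alpha_pi * td * grad_log   # actor: policy-gradient step
        s = s2

print(np.round(V, 2))  # learned values should rise toward the goal state
```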
“…Policy-based methods such as Monte Carlo can learn stochastic policies rather than deterministic ones, which has proved beneficial in some situations, but they have higher variance in their sample estimates, which slows down the overall training process. However, the actor-critic method is the one that can help the learning process with fewer samples and computational resources, and it combines the advantages of both Monte Carlo policy gradient and value-based methods [30].…”
Section: Introduction
confidence: 99%
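For contrast with the bootstrapped actor-critic update sketched above, here is a minimal Monte Carlo policy-gradient (REINFORCE) sketch: the update waits for the full episodic return, which is unbiased but noisy, the variance this excerpt says slows training. The noisy two-step environment and step size are illustrative assumptions.

```python
# Minimal REINFORCE (Monte Carlo policy gradient): no critic, no baseline.
import numpy as np

rng = np.random.default_rng(1)
n_actions, alpha, gamma = 2, 0.05, 1.0
theta = np.zeros(n_actions)  # policy logits

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def run_episode():
    # Two-step episode; Gaussian noise makes the return a noisy estimate.
    actions, rewards = [], []
    for _ in range(2):
        a = rng.choice(n_actions, p=softmax(theta))
        actions.append(a)
        rewards.append((1.0 if a == 1 else 0.0) + rng.normal(0.0, 1.0))
    return actions, rewards

for _ in range(5000):
    actions, rewards = run_episode()
    G = 0.0
    for t in reversed(range(len(actions))):
        G = rewards[t] + gamma * G         # Monte Carlo return from step t
        probs = softmax(theta)
        grad_log = -probs
        grad_log[actions[t]] += 1.0
        theta += alpha * G * grad_log      # full-return update, high variance

print(softmax(theta))  # probability mass should shift toward action 1
```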
“…Ma et al. [13] proposed a decision-making framework titled "Plan-Decision-Action" for autonomous vehicles at complex urban intersections. Zhong et al. [14] proposed a model-learning-based actor-critic algorithm with a Gaussian process approximator to solve problems with continuous state and action spaces. Xiong et al. [15] used a hidden Markov model to predict other vehicles' intentions and built a decision-making model for vehicles at intersections.…”
Section: Introduction
confidence: 99%
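The "model-learning" component this excerpt attributes to Zhong et al. [14] amounts to fitting a probabilistic model of the transition dynamics; a Gaussian process is a natural approximator because it returns both a mean prediction and an uncertainty estimate. The sketch below shows generic GP regression on transition data with scikit-learn; the one-dimensional dynamics, kernel choice, and noise level are illustrative assumptions, not the authors' exact formulation.

```python
# Fit a Gaussian process to observed (state, action) -> next-state data,
# then query its mean and uncertainty, as a model-based agent might for
# imagined rollouts. Generic GP regression sketch, illustrative only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)

# Noisy transitions from assumed 1-D dynamics s' = s + 0.1*a - 0.05*sin(s).
S = rng.uniform(-2, 2, size=(50, 1))
A = rng.uniform(-1, 1, size=(50, 1))
X = np.hstack([S, A])  # GP input: (state, action) pairs
y = (S + 0.1 * A - 0.05 * np.sin(S)).ravel() + rng.normal(0, 0.01, 50)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4)
gp.fit(X, y)

# Predicted next state with calibrated uncertainty for a queried (s, a).
mean, std = gp.predict(np.array([[0.5, 0.3]]), return_std=True)
print(f"predicted next state: {mean[0]:.3f} +/- {std[0]:.3f}")
```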
“…Reinforcement learning (RL) emphasizes that the agent learns the best strategy by interacting with the environment, so as to obtain the maximum cumulative reward. RL algorithms include value-based algorithms [11], [12] and policy-based algorithms [13], [14]. The classic value-function algorithm is the Q-learning algorithm [15].…”
Section: Introduction
confidence: 99%
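The Q-learning algorithm this excerpt cites as [15] is the classic value-based update Q(s,a) ← Q(s,a) + α(r + γ max_a' Q(s',a') − Q(s,a)). A minimal tabular sketch follows; the chain environment and hyperparameters are illustrative assumptions.

```python
# Minimal tabular Q-learning with an epsilon-greedy behavior policy.
import numpy as np

rng = np.random.default_rng(3)
n_states, n_actions = 5, 2
alpha, gamma, eps = 0.1, 0.95, 0.1
Q = np.zeros((n_states, n_actions))

def step(s, a):
    # Same toy chain as above: action 1 moves right toward the goal.
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s2 == n_states - 1
    return s2, (1.0 if done else 0.0), done

for _ in range(2000):
    s, done = int(rng.integers(n_states - 1)), False  # random start state
    while not done:
        # Epsilon-greedy exploration; the update itself is off-policy.
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * Q[s2].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s2

print(np.round(Q, 2))  # action 1 (move right) should dominate in every state
```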