Multi-Objective End-to-End Self-Driving Based on Pareto-Optimal Actor-Critic Approach

Wang, Tinghan; Luo, Yugong; Liu, Jinxin; Li, Keqiang

doi:10.1109/itsc48978.2021.9564464

Cited by 3 publications

(7 citation statements)

References 12 publications


“…(4). This intuition is in line with the Pareto optimality discussed in [9], which prescribes to update only when the gradient ascent directions (advantage functions) corresponding to all objectives are the same. Updating in the same gradient ascent direction will discover new undominated points on the Pareto front.…”

Section: B Deep MORL (supporting)

confidence: 55%

“…A2C uses a function called the advantage function for policy update to address the high variance problem of its predecessor, the REINFORCE algorithm [10]. We propose a multi-objective A2C algorithm for the considered MORL problem, following the Pareto optimality approach [9]. Fig.…”

Section: B Deep MORL (mentioning)

confidence: 99%

“…The coefficients w E and w δ reflect the priority of the policymaker for the two objectives mentioned above. Such a flexibility is missing in [9]. Notably, the actor network is updated only when both advantage functions have the same sign (both positive or both negative), as seen in Eq.…”

Section: B Deep MORL (mentioning)

confidence: 99%
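
The gating rule described in the statements above (update the actor only when both advantage functions share a sign) can be sketched as follows. This is a minimal illustration, not the citing authors' implementation: the function names `same_sign` and `gated_actor_weight` are hypothetical, and combining the two advantages as a priority-weighted sum with `w_e` and `w_delta` (standing in for w E and w δ) is an assumption, since the excerpt does not show the actual equation.

```python
def same_sign(adv_a: float, adv_b: float) -> bool:
    """Return True when both advantages are strictly positive or strictly negative."""
    return (adv_a > 0 and adv_b > 0) or (adv_a < 0 and adv_b < 0)


def gated_actor_weight(adv_a: float, adv_b: float,
                       w_e: float = 0.5, w_delta: float = 0.5) -> float:
    """Scalar that would scale the policy-gradient term for one sample.

    Assumption for illustration: when the advantages agree in sign, use
    their priority-weighted sum; when they disagree, return 0.0 so the
    actor is not updated on this sample.
    """
    if not same_sign(adv_a, adv_b):
        return 0.0
    return w_e * adv_a + w_delta * adv_b


# Agreeing advantages produce an update signal; conflicting ones are skipped.
agreeing = gated_actor_weight(1.0, 2.0)
conflicting = gated_actor_weight(1.0, -2.0)
```

In a full actor-critic loop this scalar would multiply the log-probability gradient of the chosen action; a sample with conflicting advantages then contributes nothing to the actor loss.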

“…To this end, in this work, motivated by the Pareto-optimal Q-learning (PQL) method [7] we propose multi-objective actor-critic method to avoid the forced conversion of DOS discomfort to monetary cost. Since we need to deal with high-dimensional state and action spaces for hospital augmentation planning, we utilize deep neural network based approximations for the MORL task, similar to [8], [9].…”

Section: Introduction (mentioning)

confidence: 99%


Multi-Objective Reinforcement Learning Based Healthcare Expansion Planning Considering Pandemic Events

Shuvo, Symum, Ahmed, et al. 2023

IEEE J. Biomed. Health Inform.


Hospital capacity expansion planning is critical for a healthcare authority, especially in regions with a growing diverse population. Policymaking to this end often requires satisfying two conflicting objectives, minimizing capacity expansion cost and minimizing the number of denial of service (DOS) for patients seeking hospital admission. The uncertainty in hospital demand, especially considering a pandemic event, makes expansion planning even more challenging. This work presents a multi-objective reinforcement learning (MORL) based solution for healthcare expansion planning to optimize expansion cost and DOS simultaneously for pandemic and non-pandemic scenarios. Importantly, our model provides a simple and intuitive way to set the balance between these two objectives by only determining their priority percentages, making it suitable across policymakers with different capabilities, preferences, and needs. Specifically, we propose a multi-objective adaptation of the popular Advantage Actor-Critic (A2C) algorithm to avoid forced conversion of DOS discomfort cost to a monetary cost. Our case study for the state of Florida illustrates the success of our MORL-based approach compared to the existing benchmark policies, including a state-of-the-art deep RL policy that converts DOS to economic cost to optimize a single objective.


“…Some researches have explored novel methods to achieve a balance between two conflicting optimization objectives. For example, Reymond et al [34] proposed the Pareto-DQN algorithm to estimate the Pareto front with a high-dimensional state-space and could obtain the approximately real Pareto front. Wang et al [35] proposed the Pareto-optimal actor-critic method to obtain optimal policies by optimizing the coupling objectives, which was not affected by the concavity and convexity of the Pareto front.…”

Section: UAVs Deployment and Charging (mentioning)

confidence: 99%
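
As background for the term used in this statement, a Pareto front is the set of outcomes that no other outcome dominates. The sketch below is a generic illustration under the assumption that every objective is maximized; the helper names `dominates` and `pareto_front` are hypothetical and are not taken from Pareto-DQN or from the Pareto-optimal actor-critic method.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is at least as good on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Two-objective example: (1, 4) and (2, 2) are dominated and drop out.
candidates = [(1, 5), (2, 4), (1, 4), (3, 1), (2, 2)]
front = pareto_front(candidates)
```

This brute-force filter compares every pair of points, which is adequate for small illustrative sets but not for large candidate pools.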

Multi-Objective Coordinated Optimization for UAV Charging Scheduling in Intelligent Aerial-Ground Perception Networks

Yi, Xiang, Huaguang, et al. 2023

Chinese J. Elect.


The unmanned aerial vehicles (UAVs)-assisted intelligent traffic perception system can provide effective situation awareness. However, UAVs are required to be recharged before the energy is exhausted, which may cause task interruption. To address this concern, the charging UAV (CUAV) is employed to provide wireless charging for the mission UAVs (MUAVs). This paper studies the charging scheduling problem of the CUAV under the premise of optimizing the MUAVs deployment. We first model the MUAVs deployment problem considering the energy consumption and data transmission and establish the CUAV charging model. Then, the above problem is formulated as a multi-objective multi-agent stochastic game process to simplify the decision-making of MUAVs and CUAV, based on which we propose the utility-based Pareto optimal deployment and charging algorithm, which reduces the computing complexity by equivalent utility of the MUAVs while using Kullback-Leibler divergence to constrain solutions. Next, to ensure the effectiveness of policy update, the multi-agent communication protocol is adopted to improve policy exploration efficiency. Simulation results show that the proposed algorithm outperforms existing works in terms of energy efficiency and charging by comparing with the Pareto front of different methods, endurance anxiety of the MUAVs, and charging utilization under different task modes.


Graph Reinforcement Learning-Based Decision-Making Technology for Connected and Autonomous Vehicles: Framework, Review, and Future Trends

Liu, Li, Tang, et al. 2023

Sensors


The proper functioning of connected and autonomous vehicles (CAVs) is crucial for the safety and efficiency of future intelligent transport systems. Meanwhile, transitioning to fully autonomous driving requires a long period of mixed autonomy traffic, including both CAVs and human-driven vehicles. Thus, collaborative decision-making technology for CAVs is essential to generate appropriate driving behaviors to enhance the safety and efficiency of mixed autonomy traffic. In recent years, deep reinforcement learning (DRL) methods have become an efficient way in solving decision-making problems. However, with the development of computing technology, graph reinforcement learning (GRL) methods have gradually demonstrated the large potential to further improve the decision-making performance of CAVs, especially in the area of accurately representing the mutual effects of vehicles and modeling dynamic traffic environments. To facilitate the development of GRL-based methods for autonomous driving, this paper proposes a review of GRL-based methods for the decision-making technologies of CAVs. Firstly, a generic GRL framework is proposed in the beginning to gain an overall understanding of the decision-making technology. Then, the GRL-based decision-making technologies are reviewed from the perspective of the construction methods of mixed autonomy traffic, methods for graph representation of the driving environment, and related works about graph neural networks (GNN) and DRL in the field of decision-making for autonomous driving. Moreover, validation methods are summarized to provide an efficient way to verify the performance of decision-making methods. Finally, challenges and future research directions of GRL-based decision-making methods are summarized.
