Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

Ding, Dong-Sheng; Zhang, Kaiqing; Duan, Jiali; Başar, Tamer; Jovanović, Mihailo R.

doi:10.48550/arxiv.2206.02346

Cited by 1 publication

(1 citation statement)

References 27 publications

“…The primal-dual methods, also called Lagrangian methods, are designed to address Lagrange dual problems. Ding et al (2022) introduced a natural policy gradient-based primal-dual method and demonstrated its convergence to an optimal policy at a specified convergence rate. Another primal-dual method, proposed by Bai et al (2022), ensures that a trained policy results in zero constraint violations during evaluation.…”

Section: Related Work (mentioning)

confidence: 99%

Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning

2022

Control intelligence is a typical field in which target objectives trade off against one another, and researchers in this field have long sought artificial intelligence that achieves those objectives. Multi-objective deep reinforcement learning can satisfy this need. In particular, multi-objective deep reinforcement learning methods based on policy optimization are leading the optimization of control intelligence. However, multi-objective reinforcement learning has difficulty finding diverse Pareto-optimal solutions because of the greedy nature of reinforcement learning. We propose a policy assimilation method to solve this problem. The method was applied to MO-V-MPO, a preference-based multi-objective reinforcement learning method, to increase diversity. Its performance has been verified through experiments in a continuous control environment.
