CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

Huang, Kevin; Lale, Sahin; Rosolia, Ugo; Shi, Yuanyuan; Anandkumar, Anima

doi:10.48550/arxiv.2112.07746

Cited by 2 publications

(3 citation statements)

References 0 publications


“…Remark 3. The matrix $F$ and vector $e$ (see (13)) have the dimension of $3(n_o + 2)n_p \times 3n_v$ and $3(n_o + 2)n_p$ and their complexity grow linearly with the number of obstacles $n_o$ and the planning horizon $n_p$.…”

Section: Discussion (mentioning)

confidence: 99%
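The dimension formula in the quoted remark can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name `constraint_dims` and the example values for $n_o$, $n_p$ and $n_v$ are assumptions chosen here, not numbers taken from the citing paper.

```python
def constraint_dims(n_o, n_p, n_v):
    """Return (shape of F, length of e) using the formula in the quoted remark.

    n_o: number of obstacles, n_p: planning horizon,
    n_v: third size parameter from the remark (its meaning is not stated in the excerpt).
    """
    rows = 3 * (n_o + 2) * n_p   # shared by F (rows) and e (length)
    cols = 3 * n_v               # columns of F
    return (rows, cols), rows

# Hypothetical sizes, chosen only to make the growth visible.
f_shape, e_len = constraint_dims(n_o=10, n_p=50, n_v=11)
f_shape_more_obstacles, _ = constraint_dims(n_o=20, n_p=50, n_v=11)
f_shape_longer_horizon, _ = constraint_dims(n_o=10, n_p=100, n_v=11)
```

With these assumed values, adding obstacles or lengthening the horizon changes only the row count of $F$, while the column count $3n_v$ stays fixed.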

“…IV. CONNECTIONS TO EXISTING WORKS Connection to CEM-GD: Alternate approaches of combining sampling and gradient-based approach were presented recently in [4], [13]. In these two cited works, the projection at line 5 of Alg. 1 is replaced with a gradient step of the form $\xi_i = \xi_i - \sigma \nabla_{\xi} c_1$, for some learning-rate $\sigma$.…”

Section: Decenteralized Priest (D-priest) (mentioning)

confidence: 99%

“…In these two cited works, the projection at line 5 of Alg. 1 is replaced with a gradient step of the form $\xi_i = \xi_i - \sigma \nabla_{\xi} c_1$, for some learning-rate $\sigma$. Our approach PRIEST improves [4], [13] in two main aspects. First, it can be applied to problems with non-smooth and non-analytical cost functions.…”

Section: Decenteralized Priest (D-priest) (mentioning)

confidence: 99%
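The pattern described in the two statements above (sample candidates, apply a gradient step $\xi_i = \xi_i - \sigma \nabla_{\xi} c_1$ to each, then refit the sampling distribution to the best candidates) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CEM-GD or PRIEST implementation: the quadratic cost, `TARGET`, the function names and all hyperparameter values are hypothetical, and a diagonal Gaussian is assumed for the sampling distribution.

```python
import random

# Hypothetical target; stands in for whatever the real cost c_1 rewards.
TARGET = [1.0, -2.0, 0.5]

def cost(xi):
    # Assumed smooth cost c_1: squared distance to TARGET.
    return sum((x - t) ** 2 for x, t in zip(xi, TARGET))

def grad_cost(xi):
    # Analytical gradient of the assumed cost above.
    return [2.0 * (x - t) for x, t in zip(xi, TARGET)]

def cem_with_gradient_step(dim, iters=30, n_samples=64, n_elite=8,
                           sigma=0.1, seed=0):
    rng = random.Random(seed)
    mean, std = [0.0] * dim, [1.0] * dim
    for _ in range(iters):
        # 1. Sample candidates from the current diagonal Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(n_samples)]
        # 2. Gradient step on every sample: xi <- xi - sigma * grad c_1(xi).
        samples = [[x - sigma * g for x, g in zip(xi, grad_cost(xi))]
                   for xi in samples]
        # 3. Keep the lowest-cost samples and refit mean and std to them.
        elites = sorted(samples, key=cost)[:n_elite]
        columns = list(zip(*elites))
        mean = [sum(col) / n_elite for col in columns]
        std = [max((sum((x - m) ** 2 for x in col) / n_elite) ** 0.5, 1e-3)
               for col, m in zip(columns, mean)]
    return mean

solution = cem_with_gradient_step(dim=len(TARGET))
final_cost = cost(solution)
```

Step 2 requires a differentiable cost with an available gradient, which is the limitation the citing paper points to when it says PRIEST also handles non-smooth and non-analytical cost functions.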


PRIEST: Projection Guided Sampling-Based Optimization for Autonomous Navigation

Rastgar, Masnavi, Sharma et al. 2024

IEEE Robot. Autom. Lett.


Efficient navigation in unknown and dynamic environments is crucial for expanding the application domain of mobile robots. The core challenge stems from the nonavailability of a feasible global path for guiding optimization-based local planners. As a result, existing local planners often get trapped in poor local minima. In this paper, we present a novel optimizer that can explore multiple homotopies to plan high-quality trajectories over long horizons while still being fast enough for real-time applications. We build on the gradient-free paradigm by augmenting the trajectory sampling strategy with a projection optimization that guides the samples toward a feasible region. As a result, our approach can recover from the frequently encountered pathological cases wherein all the sampled trajectories lie in the high-cost region. Furthermore, we also show that our projection optimization has a highly parallelizable structure that can be easily accelerated over GPUs. We push the state-of-the-art in the following respects. Over the navigation stack of the Robot Operating System (ROS), we show an improvement of 7-13% in success rate and up to two times in total travel time metric. On the same benchmarks and metrics, our approach achieves up to 44% improvement over MPPI and its recent variants. On simple point-to-point navigation tasks, our optimizer is up to two times more reliable than SOTA gradient-based solvers, as well as sampling-based approaches such as the Cross-Entropy Method (CEM) and VPSTO. Codes: https://github.com/fatemeh-rastgar/PRIEST

