2022
DOI: 10.48550/arxiv.2202.03957
Preprint

Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning

Abstract: We propose a new policy parameterization for representing 3D rotations during reinforcement learning. Today in the continuous control reinforcement learning literature, many stochastic policy parameterizations are Gaussian. We argue that universally applying a Gaussian policy parameterization is not always desirable for all environments. One such case in particular where this is true is tasks that involve predicting a 3D rotation output, either in isolation, or coupled with translation as part of a full 6D po…
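The abstract argues for replacing a Gaussian policy head with a Bingham distribution over unit quaternions. To make that concrete, the sketch below draws samples from a Bingham density p(x) proportional to exp(x^T M diag(Z) M^T x) on the unit 3-sphere by simple rejection sampling. This is an illustrative, assumed construction: the parameter names, the convention max(Z) = 0, and the uniform-proposal sampler are not taken from the paper, which uses its own training-time machinery.

    import numpy as np

    def sample_bingham(M, Z, n_samples, rng=None):
        # Rejection-sample unit quaternions from a Bingham density
        # p(x) proportional to exp(x^T M diag(Z) M^T x) on the 3-sphere.
        # Assumed convention: M is a 4x4 orthogonal matrix of principal
        # directions and Z holds non-positive concentrations with max(Z) == 0,
        # so the unnormalised density is bounded by 1 and a uniform proposal
        # on the sphere is a valid rejection envelope.
        rng = np.random.default_rng() if rng is None else rng
        A = M @ np.diag(Z) @ M.T
        samples = []
        while len(samples) < n_samples:
            x = rng.normal(size=4)
            x = x / np.linalg.norm(x)   # uniform draw on S^3
            log_f = x @ A @ x           # unnormalised log-density, always <= 0
            if np.log(rng.uniform()) < log_f:
                samples.append(x)
        return np.stack(samples)

    # Example: density concentrated around the identity quaternion [1, 0, 0, 0].
    # Note that q and -q receive equal probability, matching the antipodal
    # symmetry of unit-quaternion rotation representations.
    M = np.eye(4)
    Z = np.array([0.0, -5.0, -5.0, -5.0])
    quats = sample_bingham(M, Z, 100)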

Cited by 4 publications (10 citation statements)
References 28 publications (49 reference statements)
“…In our case, orientation is represented by unit quaternions. Figure 3 shows the results of learning the orientation represented as a unit quaternion using Gaussian Policy Parameterization (GPP), Tangent Space Gaussian Policy Parameterization (TSGPP), and Bingham Policy Parameterization (BPP) [34]. The quality of the learned policy using TSGPP was better than GPP for both SAC [21] and PPO [22], while compared to BPP a slightly better policy was learned for SAC and a comparable policy was learned for PPO.…”
Section: Results (mentioning)
confidence: 99%
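For context on the tangent-space Gaussian parameterization (TSGPP) mentioned in the statement above: one common way to realise the idea is to sample a 3D rotation-vector perturbation from a Gaussian and push it onto the unit-quaternion manifold through the quaternion exponential map. The sketch below is an assumed, simplified illustration with hypothetical names; it is not code from the cited works.

    import numpy as np

    def quat_exp(omega):
        # Map a rotation vector omega (axis * angle, radians) to a unit quaternion [w, x, y, z].
        angle = np.linalg.norm(omega)
        if angle < 1e-12:
            return np.array([1.0, 0.0, 0.0, 0.0])
        axis = omega / angle
        return np.concatenate(([np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis))

    def quat_mul(a, b):
        # Hamilton product of two quaternions stored as [w, x, y, z].
        w1, x1, y1, z1 = a
        w2, x2, y2, z2 = b
        return np.array([
            w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2,
        ])

    def sample_tangent_space_gaussian(q_mean, mean, cov, rng=None):
        # Sample a 3D perturbation in the tangent space at q_mean and
        # compose it with the mean orientation via the exponential map.
        rng = np.random.default_rng() if rng is None else rng
        omega = rng.multivariate_normal(mean, cov)
        return quat_mul(q_mean, quat_exp(omega))

    # Example: small random rotations around the identity orientation.
    q = sample_tangent_space_gaussian(
        np.array([1.0, 0.0, 0.0, 0.0]), np.zeros(3), 0.01 * np.eye(3))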
“…As already noted in [34], BPP parameterization relies on the prediction from multiple neural networks and this may introduce significant approximation errors. Unlike GPP and TSGPP, this culminates in an unstable learning process.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, this is not the first work to propose extensions to ARM. ARM [12] has recently been extended to a Bingham policy parameterization [18] to improve training stability. Another extension has sought to improve the control agent within coarse-to-fine ARM to use learned path ranking [19] to overcome the weaknesses of traditional path planning.…”
Section: Related Work (mentioning)
confidence: 99%