2021
DOI: 10.48550/arxiv.2106.12534
Preprint

Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation

Abstract: Reflecting on the last few years, the biggest breakthroughs in deep reinforcement learning (RL) have been in the discrete action domain. Robotic manipulation, however, is inherently a continuous control environment, but these continuous control reinforcement learning algorithms often depend on actor-critic methods that are sample-inefficient and inherently difficult to train, due to the joint optimisation of the actor and critic. To that end, we explore how we can bring the stability of discrete action RL algo…
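The truncated abstract outlines the core idea: rather than regressing continuous actions directly, the action space is discretised and then refined coarse-to-fine. Below is a minimal sketch of that discretisation idea, not the paper's implementation; q_fn, bins, and depth are illustrative placeholders standing in for the learned Q-attention network and its resolution schedule.

```python
# Minimal sketch (illustrative only) of coarse-to-fine discretisation of a
# 1-D continuous action range: each level evaluates a few discrete candidates,
# then the next level re-discretises only the neighbourhood of the best one.
import numpy as np

def coarse_to_fine_argmax(q_fn, low, high, bins=3, depth=4):
    """Approximately maximise q_fn over [low, high] by recursive discretisation.

    q_fn       : callable mapping an action value to a scalar Q estimate
                 (placeholder for a learned Q network, not modelled here).
    low, high  : current bounds of the action interval.
    bins       : number of discrete candidate actions per level.
    depth      : number of coarse-to-fine refinement levels.
    """
    for _ in range(depth):
        centers = np.linspace(low, high, bins)                 # discrete candidates
        best = centers[int(np.argmax([q_fn(a) for a in centers]))]
        half_width = (high - low) / (2 * bins)                 # shrink around the best bin
        low, high = best - half_width, best + half_width
    return (low + high) / 2.0

# Example: a toy Q-function peaked at 0.37 over the normalised range [-1, 1].
approx = coarse_to_fine_argmax(lambda a: -(a - 0.37) ** 2, -1.0, 1.0)
print(round(approx, 2))  # prints 0.37: the recursion homes in on the peak
```

Each level only ever chooses among a handful of discrete actions, which is what lets discrete-action machinery (argmax over Q-values) stand in for a continuous actor.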

Cited by 1 publication (2 citation statements)
References 23 publications
“…Continuous control RL policies are commonly parameterized as Gaussians with diagonal covariance matrices [31,30,11], though other parameterizations have been considered, including Gaussians with covariance matrix via the Cholesky factor [1], Gaussian mixtures [38], Beta distributions [3], and Bernoulli distributions [32]. Rather than directly outputting continuous values, an alternative way of parameterizing a continuous control policy is via discretization, whether through growing action spaces [4] or coarse-to-fine networks [16]. All of these works share a common goal of moving away from the conventional Gaussian parameterization, but none are ideal when faced with an action space that requires rotation predictions.…”
Section: Related Work
confidence: 99%
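For contrast with the discretisation approach, the following is a minimal PyTorch sketch of the "conventional Gaussian parameterization" the citing passage refers to: a policy network outputting the mean and log standard deviation of a diagonal Gaussian over continuous actions. Class name, layer sizes, and dimensions are illustrative assumptions, not taken from any of the cited works.

```python
# Sketch of a continuous-control policy parameterized as a Gaussian with a
# diagonal covariance matrix (independent Normal per action dimension).
import torch
import torch.nn as nn

class DiagonalGaussianPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.mean_head = nn.Linear(hidden, act_dim)         # per-dimension mean
        self.log_std = nn.Parameter(torch.zeros(act_dim))   # state-independent log std

    def forward(self, obs: torch.Tensor) -> torch.distributions.Normal:
        mean = self.mean_head(self.backbone(obs))
        # Independent per-dimension Normals == Gaussian with diagonal covariance.
        return torch.distributions.Normal(mean, self.log_std.exp())

policy = DiagonalGaussianPolicy(obs_dim=8, act_dim=2)
dist = policy(torch.randn(1, 8))
action = dist.sample()
log_prob = dist.log_prob(action).sum(-1)  # sum log-probs over action dimensions
```

A parameterization like this handles unbounded Euclidean actions naturally, but, as the citing paper notes, it is a poor fit when part of the action is a rotation.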
“…Deep reinforcement learning (RL) is now actively used in many areas, including playing games [25,33], robot manipulation [24,16], and legged robotics [13,29]. The leading (general-purpose) algorithms within the continuous control RL community are either deterministic, such as DDPG [23] and TD3 [5], or stochastic, such as SAC [11] and PPO [31].…”
Section: Introduction
confidence: 99%