2021
DOI: 10.48550/arxiv.2107.03996
Preprint

Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers

Abstract: We propose to address quadrupedal locomotion tasks using Reinforcement Learning (RL) with a Transformer-based model that learns to combine proprioceptive information and high-dimensional depth sensor inputs. While learning-based locomotion has made great advances using RL, most methods still rely on domain randomization for training blind agents that generalize to challenging terrains. Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equippe…
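The abstract's central idea, fusing a proprioceptive state vector with depth-image tokens through attention, can be illustrated with a minimal sketch. This is not the paper's architecture: the single attention head, the 4-dimensional embeddings, the one-token proprioceptive query, and all function names below are illustrative assumptions, written in pure Python so the example is self-contained.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_modal_attention(proprio_token, depth_tokens):
    """Toy single-head attention (an assumption, not the paper's model):
    the proprioceptive token acts as the query over depth-patch tokens,
    and the attended visual context is added back residually."""
    d = len(proprio_token)
    scores = [dot(proprio_token, t) / math.sqrt(d) for t in depth_tokens]
    weights = softmax(scores)
    # Weighted sum of depth-patch embeddings (the attended visual context)
    fused = [sum(w * t[i] for w, t in zip(weights, depth_tokens))
             for i in range(d)]
    # Residual connection: proprioception plus attended vision
    return [p + f for p, f in zip(proprio_token, fused)]

# Toy inputs: one 4-dim proprioceptive embedding, three depth-patch embeddings
proprio = [0.1, 0.2, 0.3, 0.4]
patches = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]
fused = cross_modal_attention(proprio, patches)
```

Because the patch embeddings are one-hot, the fused vector is simply the proprioceptive input plus the attention weights, which makes the residual fusion easy to inspect by hand.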

Cited by 9 publications (12 citation statements)
References 64 publications (65 reference statements)
“…To improve on the reactive controllers trained only with proprioceptive sensing as input, recent works have integrated vision into the deep reinforcement learning framework. This has enabled obstacle avoidance [23] as well as more dynamic crossing of rough terrain by sampling from height maps [24], [25]. Further works have shown gap crossing [26], also with full flight phases learned from pixels and leveraging MPC [27].…”
Section: A. Related Work (mentioning, confidence: 99%)
“…The legged-robot locomotion experiments are conducted on the PyBullet Unitree A1 [16] robot. We follow the same environment setting as LocoTransformer [68], including terrain shape, obstacle distribution, sensors, reward definition, and termination condition. Two agents are trained.…”
Section: Experimental Setting (mentioning, confidence: 99%)
“…This forms the "State w/o H" method. We also include the "State+Vision" method [68], which trains a visuomotor policy for the legged robot directly in the test environment, enabling it to sidestep obstacles. As shown in Table 2, with human involvement, Policy Dissection successfully transforms the "insensible" policy that is unfit for this task and greatly improves its performance, achieving results comparable to an agent trained directly for this task.…”
Section: Human-AI Shared Control for Task Transfer (mentioning, confidence: 99%)