2020
DOI: 10.1109/lra.2020.2979660
DeepGait: Planning and Control of Quadrupedal Gaits Using Deep Reinforcement Learning

Abstract: This paper addresses the problem of legged locomotion in non-flat terrain. As legged robots such as quadrupeds are to be deployed in terrains with geometries which are difficult to model and predict, the need arises to equip them with the capability to generalize well to unforeseen situations. In this work, we propose a novel technique for training neural-network policies for terrain-aware locomotion, which combines state-of-the-art methods for model-based motion planning and reinforcement learning. Our approac…

Cited by 147 publications (85 citation statements)
References 26 publications
“…We decide to tackle these challenges using model-free control via Reinforcement Learning (RL), which has shown impressive results when it comes to the control of complex motions. It has been used to solve a Rubik's cube with a robotic hand [16], learn locomotion on complex terrains, [17], [18], play table tennis [19], teach robots to imitate animals [20] and stand up from arbitrary initial conditions [21]. In addition to its capacity to solve complex tasks, once trained, RL has the advantage of requiring much less computation than optimization methods.…”
Section: High Inertia Feet
confidence: 99%
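A rough illustration of the excerpt's last point (not taken from any of the cited works): once trained, a policy reduces each control step to a single network forward pass. The observation dimension, layer sizes, and tanh squashing below are assumptions.

```python
# Illustrative sketch only: at deployment, a learned policy is a single
# forward pass through a small network, which is why it is far cheaper per
# control step than solving an optimization problem online. The network
# shape (48-D observation -> 12 joint targets) is an assumption.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "trained" weights for a two-layer MLP policy.
W1, b1 = 0.1 * rng.standard_normal((64, 48)), np.zeros(64)
W2, b2 = 0.1 * rng.standard_normal((12, 64)), np.zeros(12)

def policy(obs):
    """One control step: a single forward pass, no online optimization."""
    h = np.tanh(W1 @ obs + b1)
    return np.tanh(W2 @ h + b2)  # joint position targets in [-1, 1]

obs = rng.standard_normal(48)    # placeholder proprioceptive observation
print(policy(obs).shape)         # (12,)
```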
“…Every module, with the exception of the burst generators, independently and continuously seeks to match its perception with its goal value through corrective outputs in real time. Unlike designs that only implement feedback control at the level of joint control and some form of feed-forward computation above that to generate behavior (11,14, 22, 25, 28, 32, 43, 45, 50), our design uses closed loop negative feedback control at every level of the hierarchy. This feature replicates the purposive nature of animal behavior, which is often mistakenly assumed to be a feedforward process whereby a stimulus input is transformed by the nervous system and results in motor output (51).…”
Section: Discussion
confidence: 99%
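A minimal sketch of the hierarchy described above, assuming a toy plant and made-up gains (this is not the authors' implementation): every level runs closed-loop negative feedback, and the upper loop's corrective output becomes the goal of the loop below it.

```python
# Illustrative sketch, not the authors' implementation: a two-level hierarchy
# in which every module runs closed-loop negative feedback, continuously
# correcting the gap between its goal and its perception; the upper module's
# output is the goal of the module below it. Gains, the toy plant, and the
# signal names are assumptions.
class Loop:
    """One control module reducing (goal - perception). An integrating module
    accumulates its corrective output; otherwise the output is proportional."""
    def __init__(self, gain, integrating=False, dt=0.01):
        self.gain, self.integrating, self.dt = gain, integrating, dt
        self.output = 0.0

    def step(self, goal, perception):
        error = goal - perception
        if self.integrating:
            self.output += self.gain * error * self.dt
        else:
            self.output = self.gain * error
        return self.output

upper = Loop(gain=2.0, integrating=True)  # e.g. body-height loop (assumed)
lower = Loop(gain=5.0)                    # e.g. joint loop (assumed)

height_goal = 0.45            # desired body height [m] (illustrative)
height, joint, dt = 0.30, 0.0, 0.01

for _ in range(5000):
    joint_goal = upper.step(height_goal, height)   # upper loop sets the lower goal
    joint_vel_cmd = lower.step(joint_goal, joint)  # lower loop acts on the plant
    joint += joint_vel_cmd * dt                    # toy joint dynamics
    height += 0.5 * (joint - height) * dt          # toy body-height dynamics
print(round(height, 3))  # ~0.45: the goal is reached purely by nested feedback
```

Each module here corrects only its own perceived variable in real time; no module predicts future states or plans, which is the contrast the excerpt draws with feed-forward architectures.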
“…The ‘nervous system’ of the robot operates through a hierarchical network of simple control system modules. Unlike other robot control architectures that perform model-based control and planning (11, 12, 13, 14, 22, 23, 25, 33, 45, 50), our control architecture generates robust and adaptive goal-directed behavior through a simple feedback process requiring no model of the environment, prediction of future states, or learning. Unlike architectures in which behavior is generated by environmental stimuli or internal system dynamics (1, 2, 3, 4, 18, 24, 28, 29, 32, 36, 35, 37, 43, 46), our architecture generates adaptive behavior by automatically achieving continuously changing internal goals in the control hierarchy.…”
Section: Introduction
confidence: 99%
“…The two-layer CPG network is chosen as the locomotion generator instead of learning joint position commands directly like most of the other studies (Hwangbo et al, 2019 ; Tsounis et al, 2020 ). There are three reasons for this: (1) the CPG network constrains the basic locomotion of the robot, which reduces the search space and accelerates the learning; (2) compared to 18 joint position or joint torque commands, learning symmetric CPG coupling parameters lowers the dimension of the action space; (3) the CPG network outputs smooth joint position commands, which are easier to be realized in the real robot.…”
Section: Locomotion Optimization Via Reinforcement Learning
confidence: 99%
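A minimal sketch of such a CPG-style generator, assuming simple coupled phase oscillators and a trot phase pattern (this is not the cited two-layer network): the few symmetric parameters (frequency, coupling strength, amplitudes) are the kind of low-dimensional action an RL policy could adjust instead of 18 raw joint commands, and the outputs are smooth joint position targets.

```python
# Illustrative sketch, not the cited implementation: four coupled phase
# oscillators (one per leg) whose phases are mapped to smooth hip/knee
# position targets. Frequency, coupling, amplitudes, and the trot phase
# offsets are assumed parameters; an RL action could adjust these few
# symmetric values rather than 18 joint commands directly.
import numpy as np

trot_offsets = np.array([0.0, np.pi, np.pi, 0.0])  # LF, RF, LH, RH; diagonals in phase

def cpg_step(phases, freq_hz, coupling, dt):
    """Advance leg phases; the coupling pulls each pair toward its desired
    relative offset (Kuramoto-style phase locking)."""
    new = phases.copy()
    for i in range(4):
        dphi = 2.0 * np.pi * freq_hz
        for j in range(4):
            desired = trot_offsets[i] - trot_offsets[j]
            dphi += coupling * np.sin(phases[j] - phases[i] + desired)
        new[i] = phases[i] + dphi * dt
    return new

def joint_targets(phases, hip_amp=0.3, knee_amp=0.5):
    """Map phases to smooth hip swing and knee flexion targets (rad)."""
    hips = hip_amp * np.sin(phases)
    knees = knee_amp * np.maximum(0.0, np.sin(phases))  # flex during swing only
    return np.stack([hips, knees], axis=1)  # (4 legs, 2 joints)

phases = trot_offsets.copy()          # start phase-locked in the trot pattern
for _ in range(500):                  # 1 s at 500 Hz
    phases = cpg_step(phases, freq_hz=1.5, coupling=2.0, dt=0.002)
print(joint_targets(phases).round(2))
```

The point of the excerpt is visible in the sketch: the oscillator structure constrains the search space to a handful of coupling and amplitude parameters, and the sinusoidal outputs are smooth by construction, which eases transfer to the physical robot.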