2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros40897.2019.8967695
Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors

Abstract: Quadrotor stabilizing controllers often require careful, model-specific tuning for safe operation. We use reinforcement learning to train policies in simulation that transfer remarkably well to multiple different physical quadrotors. Our policies are low-level, i.e., we map the rotorcrafts' state directly to the motor outputs. The trained control policies are very robust to external disturbances and can withstand harsh initial conditions such as throws. We show how different training methodologies (change of t…
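The abstract's description of a low-level policy (state in, per-motor commands out) can be made concrete with a small network. The PyTorch sketch below is purely illustrative and is not the authors' released code; the 18-dimensional state layout (position error, velocity, flattened rotation matrix, angular velocity) and the layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class LowLevelPolicy(nn.Module):
    """Maps a quadrotor state vector directly to normalized per-motor commands."""
    def __init__(self, state_dim=18, num_motors=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, num_motors), nn.Tanh(),  # raw outputs in [-1, 1]
        )

    def forward(self, state):
        # Rescale from [-1, 1] to normalized motor commands in [0, 1].
        return 0.5 * (self.net(state) + 1.0)

policy = LowLevelPolicy()
# Assumed state layout: position error (3), velocity (3),
# flattened rotation matrix (9), angular velocity (3).
state = torch.zeros(1, 18)
motor_cmds = policy(state)  # shape (1, 4): one command per rotor
```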

Cited by 77 publications (56 citation statements)
References 17 publications
“…Recent advancements in the fields of Iterative Learning Control (ILC), Reinforcement Learning (RL) and Deep Learning (DL), and the growth of computational capabilities have given rise to new approaches of controller design and tuning [28], [29], [30], [31], [32], [33], [34]. These approaches have introduced advantages with regard to accuracy of models and controllers, adaptation time, and the ability to handle nonlinearities in the PUT, with the limiting requirement of abundant observation data.…”
mentioning
confidence: 99%
“…The method's adaptability for multirotor UAVs is demonstrated. By contrast, the trained RL controllers presented in [17,18,21] can only be used for a specific multirotor with the same physical structure and parameters. In our method, the policy neural network output can be converted for each actuator unit according to the dynamic model of various geometric characteristics of the vehicle.…”
Section: Discussion
mentioning
confidence: 99%
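One common way to realize such a conversion is a control-allocation (mixer) matrix derived from the airframe geometry, mapping a thrust/torque command to per-motor thrusts. The sketch below is a hypothetical illustration, not the cited paper's implementation; the motor ordering, spin directions, and parameter values are assumptions.

```python
import numpy as np

def quadrotor_mixer(arm_length, torque_coeff):
    """Allocation matrix mapping [total thrust, roll, pitch, yaw torques]
    to the four motor thrusts of an assumed X-configuration quadrotor."""
    d = arm_length / np.sqrt(2.0)  # moment arm of each rotor about the x/y axes
    k = torque_coeff               # rotor reaction torque per unit thrust
    # Rows: motors front-left, front-right, rear-right, rear-left,
    # with alternating spin directions (all assumptions for illustration).
    return np.array([
        [0.25,  0.25 / d,  0.25 / d, -0.25 / k],
        [0.25, -0.25 / d,  0.25 / d,  0.25 / k],
        [0.25, -0.25 / d, -0.25 / d, -0.25 / k],
        [0.25,  0.25 / d, -0.25 / d,  0.25 / k],
    ])

# Convert a wrench-like policy output for a specific airframe geometry.
wrench = np.array([6.0, 0.01, -0.02, 0.001])  # thrust [N], torques [N*m]
motor_thrusts = quadrotor_mixer(arm_length=0.092, torque_coeff=0.006) @ wrench
```

The same policy output can thus be re-allocated to vehicles with different arm lengths or rotor counts by swapping in the appropriate allocation matrix.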
“…To resolve this problem, Wang et al. [20] used a deterministic policy gradient algorithm with integral state to compensate for the varying weight of the quadrotor. Molchanov et al. [21] proposed a method that adjusts the output gain according to the weight of the quadrotor and implemented it on three quadrotors. The quadrotors remained stable under certain conditions even when their sizes and weights differed.…”
Section: Introduction
mentioning
confidence: 99%
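A minimal sketch of weight-dependent output scaling, assuming a normalized policy output in [0, 1] and a fixed thrust-to-weight ratio; the function name, the gain formula, and the example masses are illustrative, not the cited method's exact scheme.

```python
# Hypothetical scaling of a normalized policy output to a specific airframe.
G = 9.81  # gravitational acceleration, m/s^2

def scale_motor_command(policy_output, mass_kg, num_motors=4, thrust_to_weight=2.0):
    """Map a normalized command in [0, 1] to thrust in newtons, with the gain
    chosen so hover corresponds to the same normalized value on any vehicle."""
    max_thrust_per_motor = thrust_to_weight * mass_kg * G / num_motors
    return policy_output * max_thrust_per_motor

# The same normalized command yields weight-appropriate thrust on two vehicles.
light = scale_motor_command(0.25, mass_kg=0.033)  # e.g. a Crazyflie-sized craft
heavy = scale_motor_command(0.25, mass_kg=0.500)  # e.g. a larger quadrotor
```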
“…The key idea is that if a learned policy can work in different simulations, then it is more likely to perform well in the real world. The simplest instantiation of this idea is to inject noise into the robot's actions or sensors (Jakobi et al., 1995; Miglino et al., 1996) or to randomize the simulator parameters (Peng et al., 2017; Molchanov et al., 2019; OpenAI et al., 2018). Unlike data-driven approaches, such domain randomization approaches learn policies that are robust enough to cross the reality gap but may give up some ability to exploit the target real-world environment.…”
Section: Robustness Through Simulator Variance
mentioning
confidence: 99%
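Randomizing simulator parameters once per training episode might look like the following; the parameter set and sampling ranges are illustrative guesses, not those used in any of the cited works.

```python
import random

def randomize_dynamics(base_params, rng=random):
    """Sample randomized simulator parameters around nominal values,
    drawn once per training episode (ranges here are illustrative)."""
    return {
        "mass": base_params["mass"] * rng.uniform(0.8, 1.2),
        "arm_length": base_params["arm_length"] * rng.uniform(0.9, 1.1),
        "motor_lag": rng.uniform(0.05, 0.2),         # seconds
        "sensor_noise_std": rng.uniform(0.0, 0.02),  # injected observation noise
    }

nominal = {"mass": 0.033, "arm_length": 0.046}
for episode in range(3):
    params = randomize_dynamics(nominal)
    # env.reset(**params)  # a hypothetical env: a new simulated vehicle each episode
    print(params)
```

A policy trained across such sampled dynamics is encouraged to be robust to the whole parameter range rather than overfit to one simulated vehicle, which is what lets it cross the reality gap.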