2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros40897.2019.8967695
Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors

Abstract: Quadrotor stabilizing controllers often require careful, model-specific tuning for safe operation. We use reinforcement learning to train policies in simulation that transfer remarkably well to multiple different physical quadrotors. Our policies are low-level, i.e., we map the rotorcrafts' state directly to the motor outputs. The trained control policies are very robust to external disturbances and can withstand harsh initial conditions such as throws. We show how different training methodologies (change of t…
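The abstract's description of a low-level policy (state in, per-motor commands out) can be made concrete with a small network. The PyTorch sketch below is purely illustrative and is not the authors' released code; the 18-dimensional state layout (position error, velocity, flattened rotation matrix, angular velocity) and the layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class LowLevelPolicy(nn.Module):
    """Maps a quadrotor state vector directly to normalized per-motor commands."""
    def __init__(self, state_dim=18, num_motors=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, num_motors), nn.Tanh(),  # raw outputs in [-1, 1]
        )

    def forward(self, state):
        # Rescale from [-1, 1] to normalized motor commands in [0, 1].
        return 0.5 * (self.net(state) + 1.0)

policy = LowLevelPolicy()
# Assumed state layout: position error (3), velocity (3),
# flattened rotation matrix (9), angular velocity (3).
state = torch.zeros(1, 18)
motor_cmds = policy(state)  # shape (1, 4): one command per rotor
```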

Cited by 77 publications (56 citation statements)
References 17 publications
“…Recent advancements in the fields of Iterative Learning Control (ILC), Reinforcement Learning (RL) and Deep Learning (DL), and the growth of computational capabilities have given rise to new approaches of controller design and tuning [28], [29], [30], [31], [32], [33], [34]. These approaches have introduced advantages with regard to accuracy of models and controllers, adaptation time, and the ability to handle nonlinearities in the PUT, with the limiting requirement of abundant observation data.…”
mentioning
confidence: 99%
“…The method's adaptability for multirotor UAVs is demonstrated. By contrast, the trained RL controllers presented in [17,18,21] can only be used for a specific multirotor with the same physical structure and parameters. In our method, the policy neural network output can be converted for each actuator unit according to the dynamic model of various geometric characteristics of the vehicle.…”
Section: Discussion
mentioning
confidence: 99%
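One common way to realize such a conversion is a control-allocation (mixer) matrix derived from the airframe geometry, mapping a thrust/torque command to per-motor thrusts. The sketch below is a hypothetical illustration, not the cited paper's implementation; the motor ordering, spin directions, and parameter values are assumptions.

```python
import numpy as np

def quadrotor_mixer(arm_length, torque_coeff):
    """Allocation matrix mapping [total thrust, roll, pitch, yaw torques]
    to the four motor thrusts of an assumed X-configuration quadrotor."""
    d = arm_length / np.sqrt(2.0)  # moment arm of each rotor about the x/y axes
    k = torque_coeff               # rotor reaction torque per unit thrust
    # Rows: motors front-left, front-right, rear-right, rear-left,
    # with alternating spin directions (all assumptions for illustration).
    return np.array([
        [0.25,  0.25 / d,  0.25 / d, -0.25 / k],
        [0.25, -0.25 / d,  0.25 / d,  0.25 / k],
        [0.25, -0.25 / d, -0.25 / d, -0.25 / k],
        [0.25,  0.25 / d, -0.25 / d,  0.25 / k],
    ])

# Convert a wrench-like policy output for a specific airframe geometry.
wrench = np.array([6.0, 0.01, -0.02, 0.001])  # thrust [N], torques [N*m]
motor_thrusts = quadrotor_mixer(arm_length=0.092, torque_coeff=0.006) @ wrench
```

The same policy output can thus be re-allocated to vehicles with different arm lengths or rotor counts by swapping in the appropriate allocation matrix.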
“…To resolve this problem, Wang et al. [20] used a deterministic policy gradient algorithm with integral state to compensate for the varying weight of the quadrotor. Molchanov et al. [21] proposed a method that adjusts the output gain according to the weight of the quadrotor and implemented it on three quadrotors. The quadrotors remained stable under certain conditions even when their sizes and weights differed.…”
Section: Introduction
mentioning
confidence: 99%
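A minimal sketch of weight-dependent output scaling, assuming a normalized policy output in [0, 1] and a fixed thrust-to-weight ratio; the function name, the gain formula, and the example masses are illustrative, not the cited method's exact scheme.

```python
# Hypothetical scaling of a normalized policy output to a specific airframe.
G = 9.81  # gravitational acceleration, m/s^2

def scale_motor_command(policy_output, mass_kg, num_motors=4, thrust_to_weight=2.0):
    """Map a normalized command in [0, 1] to thrust in newtons, with the gain
    chosen so hover corresponds to the same normalized value on any vehicle."""
    max_thrust_per_motor = thrust_to_weight * mass_kg * G / num_motors
    return policy_output * max_thrust_per_motor

# The same normalized command yields weight-appropriate thrust on two vehicles.
light = scale_motor_command(0.25, mass_kg=0.033)  # e.g. a Crazyflie-sized craft
heavy = scale_motor_command(0.25, mass_kg=0.500)  # e.g. a larger quadrotor
```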
“…The key idea is that if a learned policy can work in different simulations, then it is more likely to perform well in the real world. The simplest instantiation of this idea is to inject noise into the robot's actions or sensors (Jakobi et al., 1995; Miglino et al., 1996) or to randomize the simulator parameters (Peng et al., 2017; Molchanov et al., 2019; OpenAI et al., 2018). Unlike data-driven approaches, such domain randomization approaches learn policies that are robust enough to cross the reality gap but may give up some ability to exploit the target real-world environment.…”
Section: Robustness Through Simulator Variance
mentioning
confidence: 99%
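Randomizing simulator parameters once per training episode might look like the following; the parameter set and sampling ranges are illustrative guesses, not those used in any of the cited works.

```python
import random

def randomize_dynamics(base_params, rng=random):
    """Sample randomized simulator parameters around nominal values,
    drawn once per training episode (ranges here are illustrative)."""
    return {
        "mass": base_params["mass"] * rng.uniform(0.8, 1.2),
        "arm_length": base_params["arm_length"] * rng.uniform(0.9, 1.1),
        "motor_lag": rng.uniform(0.05, 0.2),         # seconds
        "sensor_noise_std": rng.uniform(0.0, 0.02),  # injected observation noise
    }

nominal = {"mass": 0.033, "arm_length": 0.046}
for episode in range(3):
    params = randomize_dynamics(nominal)
    # env.reset(**params)  # a hypothetical env: a new simulated vehicle each episode
    print(params)
```

A policy trained across such sampled dynamics is encouraged to be robust to the whole parameter range rather than overfit to one simulated vehicle, which is what lets it cross the reality gap.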