Ivan Koryakovskiy scite author profile

Ivan Koryakovskiy

5Publications

46Citation Statements Received

70Citation Statements Given

How they've been cited

How they cite others

Affiliations

Delft University of Technology, Seoul National University

Publications

Order By: Most citations

Model-Plant Mismatch Compensation Using Reinforcement Learning

Koryakovskiy

Kudruss

Babuška

et al. 2018

IEEE Robot. Autom. Lett.

View full text Add to dashboard Cite

Abstract-Learning-based approaches are suitable for the control of systems with unknown dynamics. However, learning from scratch involves many trials with exploratory actions until a good control policy is discovered. Real robots usually cannot withstand the exploratory actions and suffer damage. This problem can be circumvented by combining learning with model-based control. In this article, we employ a nominal model-predictive controller that is impeded by the presence of an unknown modelplant mismatch. To compensate for the mismatch, we propose two approaches of combining reinforcement learning with the nominal controller. The first approach learns a compensatory control action which minimizes the same performance measure as is minimized by the nominal controller. The second approach learns a compensatory signal from a difference of a transition predicted by the internal model and an actual transition. We compare the approaches on a robot attached to the ground and performing a setpoint reaching task in simulations. We implement the better approach on the real robot and demonstrate successful learning results.

show abstract

Benchmarking model-free and model-based optimal control

Koryakovskiy

Kudruss

Babuka

et al. 2017

Robotics and Autonomous Systems

View full text Add to dashboard Cite

Model-free reinforcement learning and nonlinear model predictive control are two different approaches for controlling a dynamic system in an optimal way according to a prescribed cost function. Reinforcement learning acquires a control policy through exploratory interaction with the system, while nonlinear model predictive control exploits an explicitly given mathematical model of the system. In this article, we provide a comprehensive comparison of the performance of reinforcement learning and nonlinear model predictive control for an ideal system as well as for a system with parametric and structural uncertainties. The comparison is based on two different criteria, namely the similarity of trajectories and the resulting rewards. The evaluation of both methods is performed on a standard benchmark problem: a cart-pendulum swing-up and balance task. We first find suitable mathematical formulations and discuss the effect of the differences in the problem formulations. Then, we investigate the robustness of reinforcement learning and nonlinear model predictive control against uncertainties. The results demonstrate that nonlinear model predictive control has advantages over reinforcement learning if uncertainties can be eliminated through identification of the system parameters. Otherwise, there exists a break-even point after which model-free reinforcement learning performs better than nonlinear model predictive control with an inaccurate model. These findings suggest that benefits can be obtained by combining these methods for real systems being subject to such uncertainties. In the future, we plan to develop a hybrid controller and evaluate its performance on a real seven-degree-of-freedom walking robot.

show abstract

Reinforcement Learning of Potential Fields to achieve Limit-Cycle Walking

2016

View full text Add to dashboard Cite

A genetic algorithm with local map for path planning in dynamic environments

Koryakovskiy

Hoài

Lee

2009

View full text Add to dashboard Cite

Evaluation of physical damage associated with action selection strategies in reinforcement learning

2017

View full text Add to dashboard Cite

Reinforcement learning techniques enable robots to deal with their own dynamics and with unknown environments without using explicit models or preprogrammed behaviors. However, reinforcement learning relies on intrinsically risky exploration, which is often damaging for physical systems. In the case of the bipedal walking robot Leo, which is studied in this paper, two sources of damage can be identified: fatigue of gearboxes due to backlash re-engagements, and the overall system damage due to falls of the robot. We investigate several exploration techniques and compare them in terms of gearbox fatigue, cumulative number of falls and undiscounted return. The results show that exploration with the Ornstein-Uhlenbeck (OU) process noise leads to the highest return, but at the same time it causes the largest number of falls. The Previous Action-Dependent Action (PADA) method results in drastically reduced fatigue, but also a large number of falls. The results reveal a previously unknown trade-off between the two sources of damage. Inspired by the OU and PADA methods, we propose four new action-selection methods in a systematic way. One of the proposed methods with a time-correlated noise outperforms the well-known -greedy method in all three benchmarks. We provide guidance towards the choice of exploration strategy for reinforcement learning applications on real physical systems.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.