2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros.2018.8593894

Setting up a Reinforcement Learning Task with a Real-World Robot

Abstract: Through many recent successes in simulation, model-free reinforcement learning has emerged as a promising approach to solving continuous control robotic tasks. The research community is now able to reproduce, analyze and build quickly on these results due to open source implementations of learning algorithms and simulated benchmark tasks. To carry forward these successes to real-world applications, it is crucial to withhold utilizing the unique advantages of simulations that do not transfer to the real world a…

Cited by 76 publications (86 citation statements) · References 25 publications

Citation statements:
“…In particular, we extend the framework of maximum entropy RL. Methods of this type, such as soft actor-critic [17] and soft Q-learning [15], can achieve state-of-the-art sample efficiency [17] and have been successfully deployed in real-world manipulation tasks [16, 31], where they exhibit a high degree of robustness due to entropy maximization [16]. However, maximum entropy RL algorithms are sensitive to the choice of the temperature parameter, which determines the trade-off between exploration (maximizing the entropy) and exploitation (maximizing the reward).…”
Section: Introduction (mentioning)
confidence: 99%
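For context on the temperature trade-off mentioned in this statement, the maximum entropy RL objective (in the standard soft actor-critic formulation) augments the expected return with an entropy bonus weighted by the temperature α:

    J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]

A larger α rewards high-entropy (more exploratory) policies, while a smaller α shifts the objective toward pure reward maximization; mis-setting it is the sensitivity the quoted statement refers to.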
“…Figure 3 shows a visual representation of the RL algorithm classification, and the mathematical description of a few of the RL algorithms shown in Figure 3 is given in the appendix. Comparing model-based RL with model-free RL, model-free RL has proved to be a promising approach in the field of robotics [31]. Table 3 presents a quick view of the various RL algorithms used in the development of robots in different studies.…”
Section: A. Learning Approaches (mentioning)
confidence: 99%
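As a point of reference for the model-free vs. model-based distinction drawn in this statement, the sketch below (a hypothetical minimal example, not taken from the cited survey or from Mahmood et al.) shows tabular Q-learning, a model-free method: it improves its value estimates directly from sampled transitions and never learns the environment's dynamics.

# Hypothetical illustration of a model-free update rule: tabular Q-learning
# adjusts action values directly from observed transitions, without building
# or querying a model of the environment's dynamics.
import numpy as np

n_states, n_actions = 10, 4
alpha, gamma, epsilon = 0.1, 0.99, 0.1      # step size, discount factor, exploration rate
Q = np.zeros((n_states, n_actions))         # action-value table
rng = np.random.default_rng(0)

def act(state):
    # Epsilon-greedy action selection: explore with probability epsilon.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def update(state, action, reward, next_state):
    # One Q-learning update from a single (s, a, r, s') sample.
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])

A model-based method would instead fit a transition model from the same data and plan with it (or generate synthetic experience), which is the other branch of the classification the statement describes.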
“…These amounts of data are difficult to generate by training on real systems, i.e., without the use of simulations. The limiting factors here are, in particular, the need for a large number of parallel setups, the high time expenditure, and the wear on the hardware (Irpan 2020; Hessel et al. 2017; Kober et al. 2013; Mahmood et al. 2018). For this reason, high-quality simulations are required for training the algorithms.…”
Section: Inherent Technical Hurdles (unclassified)
“…Moreover, this kind of initialization increases the amount of data required, since the algorithm has to learn everything from scratch. Unfortunately, particularly for the commonly used model-free algorithms, it is not possible to effectively incorporate prior knowledge or to transfer previously learned knowledge to similar problems, which could counteract this issue (Polydoros and Nalpantidis 2017; Mahmood et al. 2018; Schulman et al. 2015).…”
Section: Inherent Technical Hurdles (unclassified)