Reinforcement Learning in Robotics: A Survey

Kober, Jens; Peters, Jan

doi:10.1007/978-3-642-27645-3_18

Cited by 265 publications

(274 citation statements)

References 80 publications

Supporting

Mentioning

265

Contrasting

“…A sequence of raw frames is used as input to the network and trial and error is used to learn a policy. Trial and error methods such as reinforcement learning have been extensively used to learn policies for intelligent agents [17]. However, providing demonstrations of correct behavior can greatly expedite the learning rate.…”

Section: Deep Learning (mentioning)

confidence: 99%

Deep Active Learning for Autonomous Navigation

Hussein, Gaber, Elyan (2016)

Communications in Computer and Information Science

“…Reinforcement learning is a promising approach to deal with control of physical robot with ever increasing complexity of hardware [22], [23] through experience and observations. Q-learning algorithm is a popular model-free reinforcement learning that have been demonstrated to give good results for some instances of robot tasks over the years.…”

Section: Q-learning Algorithm (mentioning)

confidence: 99%

“…It has been widely applied to the design of robot speed and orientation steering controller because of the following reasons: 1) Control rules are more flexible, thus it can simplify the complex system; 2) The controller can emulate the human decision making; 3) It does not need a detailed model of the plant, and it replaces the mathematical values in describing control system by using the linguistic ambiguous labels for designing robust controllers. On the other hand, reinforcement learning, in particular Q-learning, shows good learning results in designing control input for performing constrained tasks by robots without knowing the system dynamics [22], [23]. The approaches of combining type-1 fuzzy logic and Q-learning for optimization of the consequence parts of fuzzy rules are promising due to the ease of implementation on mobile robot navigation [12]- [17] in which Q value is a cost for each navigation behavior.…”

Section: Introduction (mentioning)

confidence: 99%

An Intelligent Control System for Mobile Robot Navigation Tasks in Surveillance

Lin et al. (2014)

Robot Intelligence Technology and Applications 2

Abstract. In recent years, the autonomous mobile robot has found diverse applications such as home/health care system, surveillance system in civil and military applications and exhibition robot. For surveillance tasks such as moving target pursuit or following and patrol in a region using mobile robot, this paper presents a fuzzy Q-learning, as an intelligent control for cost-based navigation, for autonomous learning of suitable behaviors without the supervision or external human command. The Q-learning is used to select the appropriate rule of interval type-2 fuzzy rule base. The initial testing of the intelligent control is demonstrated by simulation as well as experiment of a simple wall-following based patrolling task of autonomous mobile robot.

“…Many HRL methods have been proposed in order to reduce the complexity of the task [1]- [4]. The HAM framework [5] can learn complex hierarchical sub-routines.…”

Section: Introduction (mentioning)

confidence: 99%

Layered direct policy search for learning hierarchical skills

End, Akrour, Peters, et al. (2017)

2017 IEEE International Conference on Robotics and Automation (ICRA)

Self Cite

Abstract. Solutions to real world robotic tasks often require complex behaviors in high dimensional continuous state and action spaces. Reinforcement Learning (RL) is aimed at learning such behaviors but often fails for lack of scalability. To address this issue, Hierarchical RL (HRL) algorithms leverage hierarchical policies to exploit the structure of a task. However, many HRL algorithms rely on task specific knowledge such as a set of predefined sub-policies or sub-goals. In this paper we propose a new HRL algorithm based on information theoretic principles to autonomously uncover a diverse set of sub-policies and their activation policies. Moreover, the learning process mirrors the policy's structure and is thus also hierarchical, consisting of a set of independent optimization problems. The hierarchical structure of the learning process allows us to control the learning rate of the sub-policies and the gating individually and add specific information theoretic constraints to each layer to ensure the diversification of the sub-policies. We evaluate our algorithm on two high dimensional continuous tasks and experimentally demonstrate its ability to autonomously discover a rich set of sub-policies.
