A synthesis of automated planning and reinforcement learning for efficient, robust decision-making

Leonetti, Matteo; Iocchi, Luca; Stone, Peter

doi:10.1016/j.artint.2016.07.004

Cited by 85 publications

(61 citation statements)

References 21 publications

“…In [9], Leonetti et al. investigated a low-level integration of RL and external controllers in which the RL algorithm explores only with feasible actions provided by the planner. These heuristics cannot be discarded, either for training or for testing. Therefore, the performance of the learner depends heavily on, if it is not limited by, the capability of the heuristics.…”
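
The restriction described in this citation statement (exploring only with actions the planner deems feasible) can be illustrated with a minimal sketch. It assumes a tabular Q-function stored in a dict and a planner that returns a list of feasible actions for the current state. The names `choose_action`, `q_values`, and `allowed_actions` are hypothetical and are not taken from either paper.

```python
import random


def choose_action(q_values, allowed_actions, epsilon, rng=random):
    """Epsilon-greedy selection restricted to planner-approved actions.

    q_values: dict mapping action -> estimated value (hypothetical layout).
    allowed_actions: list of actions the planner considers feasible.
    epsilon: probability of exploring among the allowed actions.
    """
    if not allowed_actions:
        raise ValueError("planner returned no feasible actions")
    if rng.random() < epsilon:
        # Explore, but only inside the planner's feasible set.
        return rng.choice(allowed_actions)
    # Exploit: best-valued action among the feasible ones only.
    return max(allowed_actions, key=lambda a: q_values.get(a, 0.0))


q = {"left": 1.0, "right": 5.0, "wait": 9.0}
feasible = ["left", "right"]  # "wait" is excluded by the (assumed) planner
greedy = choose_action(q, feasible, epsilon=0.0)
```

Here "wait" has the highest estimated value but is never selected, which mirrors the point made in the statement: the learner can do no better than the feasible set it is given.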

Section: Related Work | mentioning

confidence: 99%

Learning With Stochastic Guidance for Robot Navigation

Xie

Miao

Wang

et al. 2021

IEEE Trans. Neural Netw. Learning Syst.

Due to the sparse rewards and high degree of environment variation, reinforcement learning approaches such as Deep Deterministic Policy Gradient (DDPG) are plagued by issues of high variance when applied in complex real world environments. We present a new framework for overcoming these issues by incorporating a stochastic switch, allowing an agent to choose between high and low variance policies. The stochastic switch can be jointly trained with the original DDPG in the same framework. In this paper, we demonstrate the power of the framework in a navigation task, where the robot can dynamically choose to learn through exploration, or to use the output of a heuristic controller as guidance. Instead of starting from completely random moves, the navigation capability of a robot can be quickly bootstrapped by several simple independent controllers. The experimental results show that with the aid of stochastic guidance we are able to effectively and efficiently train DDPG navigation policies and achieve significantly better performance than state-of-the-art baselines models.

“…The integration of symbolic planning with reinforcement learning has been studied in a variety of approaches [12,22,29,31]. These methods focus on leveraging the strengths of one of the paradigms to enhance the other.…”

Section: Related Work | mentioning

confidence: 99%

Task-Motion Planning with Reinforcement Learning for Adaptable Mobile Service Robots

Jiang

Yang

Zhang

et al. 2019

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Self Cite

Task-motion planning (TMP) addresses the problem of efficiently generating executable and low-cost task plans in a discrete space such that the (initially unknown) action costs are determined by motion plans in a corresponding continuous space. However, a task-motion plan can be sensitive to unexpected domain uncertainty and changes, leading to suboptimal behaviors or execution failures. In this paper, we propose a novel framework, TMP-RL, which is an integration of TMP and reinforcement learning (RL) from the execution experience, to solve the problem of robust task-motion planning in dynamic and uncertain domains. TMP-RL features two nested planning-learning loops. In the inner TMP loop, the robot generates a low-cost, feasible task-motion plan by iteratively planning in the discrete space and updating relevant action costs evaluated by the motion planner in continuous space. In the outer loop, the plan is executed, and the robot learns from the execution experience via model-free RL, to further improve its task-motion plans. RL in the outer loop is more accurate to the current domain but also more expensive, and using less costly task and motion planning leads to a jump-start for learning in the real world. Our approach is evaluated on a mobile service robot conducting navigation tasks in an office area. Results show that the TMP-RL approach significantly improves adaptability and robustness (in comparison to TMP methods) and leads to rapid convergence (in comparison to task planning (TP)-RL methods). We also show that TMP-RL can reuse learned values to smoothly adapt to new scenarios during long-term deployments.

“…4. (lines 8,13,14) Suppose that the condition on line 8 is false. We then do not create a new state.…”

Section: (Line 6-7) | mentioning

confidence: 99%

Learning Abstract Planning Domains and Mappings to Real World Perceptions

Serafini

Traverso

2019

Lecture Notes in Computer Science

Most of the works on planning and learning, e.g., planning by (model-based) reinforcement learning, are based on two main assumptions: (i) the set of states of the planning domain is fixed; (ii) the mapping between the observations from the real world and the states is implicitly assumed or learned offline, and it is not part of the planning domain. Consequently, the focus is on learning the transitions between states. In this paper, we drop such assumptions. We provide a formal framework in which (i) the agent can dynamically learn new states of the planning domain; (ii) the mapping between abstract states and the perception from the real world, represented by continuous variables, is part of the planning domain; (iii) such mapping is learned and updated along the "life" of the agent. We define an algorithm that interleaves planning, acting, and learning, and allows the agent to update the planning domain depending on how much it trusts the model w.r.t. the new experiences learned by executing actions. We define a measure of coherence between the planning domain and the real world as perceived by the agent. We test our approach showing that the agent learns increasingly coherent models, and that the system can scale to deal with models with on the order of 10^6 states.
