Robotics: Science and Systems XVI 2020
DOI: 10.15607/rss.2020.xvi.003

Deep Visual Reasoning: Learning to Predict Action Sequences for Task and Motion Planning from an Initial Scene Image

Abstract: In this paper, we propose a deep convolutional recurrent neural network that predicts action sequences for task and motion planning (TAMP) from an initial scene image. Typical TAMP problems are formalized by combining reasoning on a symbolic, discrete level (e.g. first-order logic) with continuous motion planning such as nonlinear trajectory optimization. Due to the great combinatorial complexity of possible discrete action sequences, a large number of optimization/motion planning problems have to be solved to…
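As a rough illustration of the kind of architecture the abstract describes, the hedged PyTorch sketch below pairs a convolutional encoder of the initial scene image with an LSTM that scores discrete action tokens. The class name, layer sizes, and action vocabulary (`ActionSequencePredictor`, `num_actions`, the 64x64 input, etc.) are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch only: a convolutional encoder feeds an LSTM that scores
# discrete action tokens given an initial scene image. Layer sizes and the
# action vocabulary are illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn

class ActionSequencePredictor(nn.Module):
    def __init__(self, num_actions=10, hidden=128):
        super().__init__()
        # Convolutional encoder for the initial scene image (3x64x64 assumed).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden),
        )
        # Recurrent decoder over symbolic action tokens, conditioned on the image code.
        self.embed = nn.Embedding(num_actions, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, image, action_tokens):
        code = self.encoder(image)            # (B, hidden) image embedding
        h0 = code.unsqueeze(0)                # (1, B, hidden) initial hidden state
        c0 = torch.zeros_like(h0)
        emb = self.embed(action_tokens)       # (B, T, hidden)
        out, _ = self.rnn(emb, (h0, c0))
        return self.head(out)                 # (B, T, num_actions) next-action logits

model = ActionSequencePredictor()
logits = model(torch.randn(1, 3, 64, 64), torch.tensor([[0, 3, 1]]))
print(logits.shape)                           # torch.Size([1, 3, 10])
```

Scoring a candidate action sequence with such a network is a single cheap forward pass, which is what makes this kind of predictor usable as guidance before any expensive trajectory optimization is attempted.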

Cited by 63 publications (40 citation statements)
References 32 publications
“…However, their search is over plan refinements, rather than abstract actions, and they do not address the state representation problem. Driess et al. (2020a) proposed an approach that directly predicts a task plan from an initial image of the scene, based on which motion-level planning was performed to find a motion plan that satisfies the predicted task plan. Our method differs in that (1) we assume we know the poses and shapes of objects, and (2) we provide guidance both at the task and motion levels based on a representation that can reason about occlusion, reachability, and collisions.…”
Section: Learning To Guide Planning
confidence: 99%
“…There are several orthogonal avenues of research under this umbrella: methods which learn capabilities that may be difficult to engineer (e.g. a pouring action) [14], those which learn the symbolic representations with which to plan [15], [16], those that integrate perception learning and scene understanding into TAMP [17], [18], and those which attempt to learn search guidance from experience [19], [20], [21]. Similar in spirit to our work, [19] tries to guide the search for action skeletons by learning to predict the credibility of a sequence of discrete actions directly from visual observations of the scene, using these predictions as a heuristic in a best-first search for action sequences.…”
Section: Related Work Integrated Task and Motion Planning
confidence: 99%
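The search pattern this excerpt describes, best-first expansion of discrete action skeletons ordered by a learned credibility score, can be sketched as follows. `score_fn`, `ACTIONS`, `max_len`, and `beam` are hypothetical placeholders, and the toy scoring function merely stands in for a neural network's prediction.

```python
# Hedged sketch of the described search pattern: best-first expansion of
# discrete action skeletons ordered by a learned credibility score.
# Names and bounds here are placeholders, not taken from the cited papers.
import heapq
import itertools

ACTIONS = ["pick(A)", "place(A)", "pick(B)", "place(B)"]   # toy symbolic actions

def best_first_skeletons(score_fn, max_len=3, beam=100):
    """Yield (skeleton, score) pairs in order of decreasing predicted credibility."""
    frontier = [(-score_fn(()), ())]              # min-heap over negated scores
    while frontier:
        neg_score, seq = heapq.heappop(frontier)
        yield list(seq), -neg_score
        if len(seq) < max_len:
            for a in ACTIONS:
                child = seq + (a,)
                heapq.heappush(frontier, (-score_fn(child), child))
        if len(frontier) > beam:                  # bound the frontier size
            frontier = heapq.nsmallest(beam, frontier)   # a sorted list is a valid heap

# Toy credibility function standing in for the learned network's prediction.
toy_score = lambda seq: 1.0 / (1.0 + len(seq))
for skeleton, s in itertools.islice(best_first_skeletons(toy_score), 5):
    print(round(s, 2), skeleton)
```

In the approach the excerpt refers to, the credibility score is conditioned on the scene observation, so the most promising skeletons are handed to the motion-level solver first.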
“…In contrast, we seek to learn an efficient heuristic to guide the construction of an optimistic planning problem with a minimal set of irrelevant facts. We leverage an off-the-shelf domain-independent search sub-routine; thus, our work can be synergistically combined with methods like [19], [20], [21] that learn a domain-specific heuristic.…”
Section: Related Work Integrated Task and Motion Planning
confidence: 99%
“…This is especially relevant for the alignment term φ_align, which has multiple local minima. Instead of explicitly enumerating different possible alignments as in [41], [42], we diversify grasp candidate computation by introducing a random alignment term, effectively randomizing the approach axis of a grasp candidate. That is, at initialization we compute a random orthonormal basis…”
Section: A Grasp Configuration Observer
confidence: 99%
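The randomization step mentioned at the end of this excerpt (drawing a random orthonormal basis, e.g. to randomize a grasp candidate's approach axis) can be illustrated with the small NumPy sketch below. The function name and the choice of QR decomposition are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch: draw a random orthonormal basis via QR decomposition of a
# Gaussian matrix, e.g. to randomize a grasp candidate's approach axis.
# Not the implementation from the cited paper.
import numpy as np

def random_orthonormal_basis(rng=None):
    """Return a 3x3 rotation matrix whose columns form a random orthonormal basis."""
    rng = np.random.default_rng() if rng is None else rng
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q *= np.sign(np.diag(r))                 # sign convention for a well-spread draw
    if np.linalg.det(q) < 0:                 # enforce a right-handed frame (det = +1)
        q[:, 2] *= -1
    return q

basis = random_orthonormal_basis()
approach_axis = basis[:, 0]                  # e.g. take one column as the approach axis
print(np.allclose(basis.T @ basis, np.eye(3)))   # True: columns are orthonormal
```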