Multi-object rearrangement is a crucial skill for service robots, and it frequently requires commonsense reasoning. However, achieving commonsense arrangements requires knowledge about objects, which is hard to transfer to robots. Large language models (LLMs) are one potential source of this knowledge, but they do not naively capture information about plausible physical arrangements of the world. We propose LLM-GROP, which uses prompting to extract commonsense knowledge about semantically valid object configurations from an LLM and instantiates them with a task and motion planner in order to generalize to varying scene geometry. LLM-GROP allows us to go from natural-language commands to human-aligned object rearrangement in varied environments. Based on human evaluations, our approach achieves the highest rating and outperforms competitive baselines in success rate while maintaining comparable cumulative action costs. Finally, we demonstrate a practical implementation of LLM-GROP on a mobile manipulator in real-world scenarios.
Autonomous vehicles need to plan at the task level to compute a sequence of symbolic actions, such as merging left and turning right, to fulfill people's service requests, where efficiency is the main concern. At the same time, the vehicles must compute continuous trajectories to perform actions at the motion level, where safety matters most. Task-motion planning in autonomous driving thus faces the problem of maximizing task-level efficiency while ensuring motion-level safety. To this end, we develop the algorithm Task-Motion Planning for Urban Driving (TMPUD), which, for the first time, enables the task and motion planners to communicate about the safety level of driving behaviors. TMPUD has been evaluated using a realistic urban driving simulation platform. Results suggest that TMPUD performs significantly better than competitive baselines from the literature in efficiency, while ensuring the safety of driving behaviors.
Task and motion planning (TAMP) algorithms have been developed to help robots plan behaviors in discrete and continuous spaces. Robots face complex real-world scenarios (e.g., in kitchens or shopping centers), where it is hardly possible to model all objects or their physical properties for robot planning. In this paper, we define a new object-centric TAMP problem, where the TAMP robot does not know object properties (e.g., size and weight of blocks). We then introduce Task-Motion Object-Centric planning (TMOC), a grounded TAMP algorithm that learns to ground objects and their physical properties with a physics engine. TMOC is particularly useful for tasks that involve dynamic, complex robot-multi-object interactions that can hardly be modeled beforehand. We have demonstrated and evaluated TMOC in simulation and on a real robot. Results show that TMOC outperforms competitive baselines from the literature in cumulative utility.