We have developed learning and interaction algorithms to support a human teaching hierarchical task models to a robot using a single demonstration in the context of a mixed-initiative interaction with bi-directional communication. In particular, we have identified and implemented two important heuristics for suggesting task groupings based on the physical structure of the manipulated artifact and on the data flow between tasks. We have evaluated our algorithms with users in a simulated environment and shown both that the overall approach is usable and that the grouping suggestions significantly improve the learning and interaction.
This work seeks to leverage semantic networks containing millions of entries encoding assertions of commonsense knowledge to enable improvements in robot task execution and learning. The specific application we explore in this project is object substitution in the context of task adaptation. Humans easily adapt their plans to compensate for missing items in day-to-day tasks, substituting a wrap for bread when making a sandwich, or stirring pasta with a fork when out of spoons. Robot plan execution, however, is far less robust, with missing objects typically leading to failure if the robot is not aware of alternatives. In this article, we contribute a context-aware algorithm that leverages the linguistic information embedded in the task description to identify candidate substitution objects without reliance on explicit object affordance information. Specifically, we show that the task context provided by the task labels within the action structure of a task plan can be leveraged to disambiguate information within a noisy large-scale semantic network containing hundreds of potential object candidates to identify successful object substitutions with high accuracy. We present two extensive evaluations of our work on both abstract and real-world robot tasks, showing that the substitutions made by our system are valid, accepted by users, and lead to a statistically significant reduction in robot learning time. In addition, we report the outcomes of testing our approach with a large number of crowd workers interacting with a robot in real time.
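To make the idea of context-aware substitution concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a tiny hand-made relatedness table (`EDGES`), a hypothetical scoring function (`rank_substitutes`), and an arbitrary mixing weight (`context_weight`); none of these names or numbers come from the abstract. It only illustrates how words from a task label (for example "stir" and "pasta") can reorder candidates that a context-free similarity lookup would rank differently.

```python
# Hypothetical toy semantic network: (concept_a, concept_b) -> relatedness weight.
# All entries are invented for illustration only.
EDGES = {
    ("spoon", "ladle"): 0.9,
    ("spoon", "fork"): 0.8,
    ("spoon", "knife"): 0.6,
    ("fork", "stir"): 0.5,
    ("ladle", "stir"): 0.6,
    ("knife", "stir"): 0.1,
    ("fork", "pasta"): 0.7,
    ("ladle", "pasta"): 0.2,
    ("knife", "pasta"): 0.1,
}


def relatedness(a, b):
    """Symmetric lookup in the toy network; unknown pairs score 0."""
    return EDGES.get((a, b), EDGES.get((b, a), 0.0))


def rank_substitutes(target, context, candidates, context_weight=0.5):
    """Rank candidates by similarity to the missing object, blended with
    their average relatedness to the task-context words (assumed scoring)."""
    scores = {}
    for cand in candidates:
        similarity = relatedness(target, cand)
        if context:
            ctx = sum(relatedness(cand, word) for word in context) / len(context)
        else:
            ctx = 0.0
        scores[cand] = (1 - context_weight) * similarity + context_weight * ctx
    return sorted(scores, key=scores.get, reverse=True)


candidates = ["fork", "ladle", "knife"]
without_context = rank_substitutes("spoon", [], candidates, context_weight=0.0)
with_context = rank_substitutes("spoon", ["stir", "pasta"], candidates)
```

In this toy table, the context-free ranking prefers the ladle, while adding the task words moves the fork to the top. A real system would draw relatedness from a large, noisy semantic network rather than a hand-written dictionary.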
In a multi-agent setting, the optimal policy of a single agent is largely dependent on the behavior of other agents. We investigate the problem of multi-agent reinforcement learning, focusing on decentralized learning in non-stationary domains for mobile robot navigation. We identify a cause for the difficulty in training non-stationary policies, namely mutual adaptation to sub-optimal behaviors, and we use this to motivate a curriculum-based strategy for learning interactive policies. The curriculum has two stages. First, the agent leverages policy gradient algorithms to learn a policy that is capable of achieving multiple goals. Second, the agent learns a modifier policy that governs how it interacts with other agents in a multi-agent setting. We evaluated our approach on both an autonomous driving lane-change domain and a robot navigation domain.
Current approaches to learning partial ordering constraints by demonstration require demonstrating all (or almost all) possible completion orders. We have developed an algorithm that, for plans involving relative placement of objects, learns the partial ordering constraints from a single demonstration by letting the user specify naturally conceived reference frame information. This work is an example of a broader research agenda that involves applying principles of human collaboration to robot learning from demonstration.
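As a rough illustration of how reference frame information could yield ordering constraints from one demonstration, here is a minimal sketch. It assumes, as a simplification not stated in the abstract, that each placement is annotated with the object it was placed relative to, and that the only constraint derived is "the reference object must be placed before the dependent object". The function names and the table-setting data are hypothetical.

```python
from itertools import permutations


def ordering_constraints(demo):
    """demo: list of (obj, reference_or_None) in demonstrated order.
    Returns a set of (before, after) pairs under the assumed rule that a
    reference object must be placed before anything placed relative to it."""
    return {(ref, obj) for obj, ref in demo if ref is not None}


def is_valid_order(order, constraints):
    """Check whether a completion order respects every (before, after) pair."""
    position = {obj: i for i, obj in enumerate(order)}
    return all(position[a] < position[b] for a, b in constraints)


# Hypothetical single demonstration of setting a table.
demo = [
    ("plate", None),      # placed in the world frame
    ("fork", "plate"),    # placed relative to the plate
    ("knife", "plate"),   # placed relative to the plate
    ("napkin", None),     # placed in the world frame
]

constraints = ordering_constraints(demo)
objects = [obj for obj, _ in demo]
valid_orders = [p for p in permutations(objects) if is_valid_order(p, constraints)]
```

Under these assumptions the single demonstration admits 8 of the 24 possible completion orders, none of which had to be demonstrated individually.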
While mobile robots reliably perform each service task by accurately localizing and safely navigating while avoiding obstacles, they do not respond in any other way to their surroundings. We can make the robots more responsive to their environment by equipping them with models of multiple tasks and a way to interrupt a specific task and switch to another task based on observations. However, the challenges of a multiple-task-model approach include selecting a task model to execute based on observations and handling a potentially large set of observations associated with the set of all individual task models. We present a novel two-step solution. First, our approach leverages the tasks' policies and an abstract representation of their states, and learns which task should be executed at each given world state. Second, the algorithm uses the learned tasks and identifies the observation stimuli that trigger the interruption of one task and the switch to another task. We show that our solution using the switching stimuli compares favorably to the naive approach of learning a combined model for all the tasks. Moreover, leveraging the stimuli significantly decreases the amount of sensory input processing during the execution of tasks.
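The second step, identifying which observations act as switching stimuli, can be sketched in a simplified form. This is an illustrative assumption rather than the paper's method: the state is a tuple of binary features, the state-to-task mapping comes from a hand-written rule (`select_task`) standing in for the learned selection of the first step, and a feature counts as a stimulus if flipping it alone ever changes the selected task. All names and features are hypothetical.

```python
from itertools import product


def switching_stimuli(task_for_state, n_features):
    """Return indices of binary features whose flip changes the selected task
    in at least one state. Features never returned can be ignored at run time."""
    stimuli = set()
    for state, task in task_for_state.items():
        for i in range(n_features):
            flipped = state[:i] + (1 - state[i],) + state[i + 1:]
            # States missing from the mapping are treated as "no change".
            if task_for_state.get(flipped, task) != task:
                stimuli.add(i)
    return stimuli


def select_task(state):
    """Hand-written stand-in for a learned state-to-task mapping.
    Assumed features: (person_nearby, door_open, battery_low)."""
    person_nearby, _door_open, battery_low = state
    if battery_low:
        return "charge"
    if person_nearby:
        return "greet"
    return "deliver"


task_for_state = {s: select_task(s) for s in product((0, 1), repeat=3)}
stimuli = switching_stimuli(task_for_state, n_features=3)
```

In this toy setup only `person_nearby` and `battery_low` (indices 0 and 2) are found to trigger a switch, so the `door_open` observation would not need to be processed for task switching, which mirrors the reduction in sensory processing described above.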