Language communication plays an important role in human learning and knowledge acquisition. With the emergence of a new generation of cognitive robots, empowering these robots to learn directly from human partners becomes increasingly important. This paper gives a brief introduction to interactive task learning where humans can teach physical agents new tasks through natural language communication and action demonstration. It discusses research challenges and opportunities in language and communication grounding that are critical in this process. It further highlights the importance of commonsense knowledge, particularly the very basic physical causality knowledge, in grounding language to perception and action.
This paper describes an approach for a robotic arm to learn new actions through dialogue in a simplified blocks world. In particular, we have developed a three-tier action knowledge representation that, on one hand, supports the connection between symbolic representations of language and the robot's continuous sensorimotor representations and, on the other hand, supports the application of existing planning algorithms to address novel situations. Our empirical studies have shown that, based on this representation, the robot was able to learn and execute basic actions in the blocks world. When a human engages in dialogue to teach the robot new actions, step-by-step instructions lead to better learning performance than one-shot instructions.
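To make the three-tier idea concrete, here is a minimal Python sketch written for this article rather than taken from the paper; all names (the verb `grab`, the fluent `holding`, the `locate` routine, the primitive operators) are hypothetical. It shows how a taught verb could live at the symbolic goal level, how discrete operators sit in the middle, and how each operator bottoms out in continuous sensorimotor commands.

```python
# A three-tier sketch with hypothetical names (not the authors' implementation).
# Upper tier: a taught verb stored as symbolic goal fluents.
# Middle tier: discrete operators a planner can sequence.
# Lower tier: each operator mapped onto continuous sensorimotor commands.

from dataclasses import dataclass

def grab_goal(obj):
    """Upper tier: the verb "grab" taught in dialogue, stored as goal fluents."""
    return {("holding", obj)}

@dataclass
class PrimitiveOp:
    """Middle tier: a discrete operator that a symbolic planner can sequence."""
    name: str
    args: tuple = ()

def execute(op, locate):
    """Lower tier: turn one discrete operator into continuous commands.
    `locate` is a stand-in for perception, returning an object's coordinates."""
    if op.name == "open_gripper":
        return [("gripper", 1.0)]                 # fully open
    if op.name == "close_gripper":
        return [("gripper", 0.0)]                 # fully closed
    if op.name == "move_to":
        x, y, z = locate(op.args[0])
        return [("arm_goto", (x, y, z + 0.02))]   # hover 2 cm above the object
    raise ValueError(f"unknown primitive {op.name!r}")

# A planner can expand grab_goal("blue_block") into
# [open_gripper, move_to(blue_block), close_gripper] and hand each step to execute().
```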
To enable human-robot communication and collaboration, previous work represents grounded verb semantics as the potential change of state that a verb causes in the physical world. Grounded verb semantics are acquired mainly from parallel data pairing the use of a verb phrase with the corresponding sequence of primitive actions demonstrated by humans. The rich interaction between teachers and students that is considered important in learning new skills has not yet been explored. To address this limitation, this paper presents a new interactive learning approach that allows robots to proactively engage their human partners by asking good questions to learn models for grounded verb semantics. The proposed approach uses reinforcement learning to acquire an optimal policy for the robot's question-asking behaviors by maximizing the long-term reward. Our empirical results have shown that the interactive learning approach leads to more reliable models for grounded verb semantics, especially in noisy environments full of uncertainty. Compared to previous work, the models acquired from interactive learning yield a 48% to 145% performance gain when applied in new situations.
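The following sketch illustrates, under our own simplified assumptions, how such a question-asking policy could be learned with tabular Q-learning; the state features, action set (`ask_demo`, `ask_yes_no`, `no_question`), and reward values are hypothetical stand-ins, not the paper's actual formulation.

```python
# Minimal Q-learning sketch for a question-asking policy (hypothetical state
# and action encodings; the paper's actual features and rewards may differ).

import random
from collections import defaultdict

ACTIONS = ["ask_demo", "ask_yes_no", "no_question"]   # candidate robot questions

Q = defaultdict(float)          # Q[(state, action)] -> expected long-term reward
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def choose(state):
    """Epsilon-greedy question selection."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One-step Q-learning backup after observing the human's response."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Example: a state might summarize how uncertain the current verb model is and
# how much the human has already been interrupted; the reward trades off model
# improvement against the cost of asking.
s = ("high_uncertainty", "few_questions_asked")
a = choose(s)
update(s, a, reward=+1.0, next_state=("low_uncertainty", "few_questions_asked"))
```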
When humans and artificial agents (e.g., robots) have mismatched perceptions of the shared environment, referential communication between them becomes difficult. To mediate perceptual differences, this paper presents a new approach that uses probabilistic labeling for referential grounding. The approach integrates different types of evidence from the collaborative referential discourse into a unified scheme, and its probabilistic labeling procedure can generate multiple grounding hypotheses to facilitate follow-up dialogue. Our empirical results have shown that the probabilistic labeling approach significantly outperforms a previous graph-matching approach for referential grounding.
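As a rough illustration (not the authors' model), probabilistic labeling can be thought of as ranking joint assignments of linguistic mentions to perceived objects and keeping several of the best assignments as grounding hypotheses for later dialogue. The compatibility scores below are made up for the example:

```python
# Illustrative sketch of probabilistic labeling for referential grounding
# (hypothetical scores; not the authors' graph model).  Each referring
# expression gets a compatibility score against every perceived object, and
# joint assignments are ranked so several grounding hypotheses survive.

from itertools import permutations

# Compatibility of each mention with each perceived object, e.g. from color or
# size classifiers and spatial relations (made-up numbers).
compat = {
    ("the red block", "obj1"): 0.70, ("the red block", "obj2"): 0.20, ("the red block", "obj3"): 0.10,
    ("the big one",  "obj1"): 0.15, ("the big one",  "obj2"): 0.60, ("the big one",  "obj3"): 0.25,
}
mentions = ["the red block", "the big one"]
objects = ["obj1", "obj2", "obj3"]

def hypotheses(top_k=3):
    """Rank one-to-one assignments of mentions to objects by joint probability."""
    scored = []
    for assignment in permutations(objects, len(mentions)):
        p = 1.0
        for m, o in zip(mentions, assignment):
            p *= compat[(m, o)]
        scored.append((p, dict(zip(mentions, assignment))))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]

for p, h in hypotheses():
    print(f"p={p:.3f}  {h}")
```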
As a new generation of cognitive robots starts to enter our lives, it is important to enable robots to follow human commands and to learn new actions from human language instructions. To address this issue, this paper presents an approach that explicitly represents verb semantics through hypothesis spaces of fluents and automatically acquires these hypothesis spaces by interacting with humans. The learned hypothesis spaces can be used to automatically plan lower-level primitive actions for physical-world interaction. Our empirical results have shown that the hypothesis-space representation of fluents, combined with the learned hypothesis selection algorithm, outperforms a previous baseline. In addition, our approach applies incremental learning, which can contribute to life-long learning from humans in the future.
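A small sketch of what a hypothesis space of fluents might look like, assuming a simplified induction procedure of our own: each hypothesis is a candidate set of goal fluents derived from the changes a demonstration causes, and new evidence incrementally prunes inconsistent hypotheses. The fluents and the verb "fill" are illustrative only.

```python
# Sketch of a hypothesis space of fluents for one verb (hypothetical fluents,
# simplified induction; not the authors' exact algorithm).  Each hypothesis is
# a candidate set of goal fluents; new demonstrations prune hypotheses whose
# fluents do not hold in the observed resulting state.

from itertools import combinations

def induce_hypotheses(before, after):
    """Candidate goal fluents: every nonempty subset of the fluents a demo changed."""
    changed = sorted(after - before)
    return [frozenset(c) for r in range(1, len(changed) + 1)
            for c in combinations(changed, r)]

def prune(space, observed_after):
    """Incremental update: keep hypotheses consistent with a new demonstration."""
    return [h for h in space if h <= observed_after]

# "Fill the cup": one demonstration changes these symbolic fluents.
before = frozenset({("grasping", "cup"), ("empty", "cup")})
after  = frozenset({("grasping", "cup"), ("full", "cup"), ("under", "cup", "tap")})

space = induce_hypotheses(before, after)             # all candidate goal-state hypotheses
space = prune(space, frozenset({("full", "cup")}))   # a second example narrows the space
print(space)                                         # [frozenset({('full', 'cup')})]
```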
Robots often have limited knowledge and need to continuously acquire new knowledge and skills in order to collaborate with their human partners. To address this issue, this paper describes an approach that allows human partners to teach a robot (i.e., a robotic arm) new high-level actions through natural language instructions. In particular, built upon the traditional planning framework, we propose a representation of high-level actions that consists only of the desired goal states rather than step-by-step operations (although these operations may be specified by the human in their instructions). Our empirical results have shown that, given this representation, the robot can rely on automated planning and immediately apply newly learned action knowledge to perform actions in novel situations.
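A minimal sketch of this goal-state representation, assuming a toy blocks world and a plain breadth-first planner of our own (the operator set and fluent names are hypothetical, not the paper's planner): the taught action is stored only as goal fluents, and the steps to reach them are recomputed from whatever the current state happens to be.

```python
# Toy blocks-world sketch of planning from a taught goal state (hypothetical
# operators and fluents; not the authors' planner).  The learned action is
# stored only as goal fluents, and a breadth-first search recomputes the
# primitive steps from the state the robot currently observes.

from collections import deque

def ground_ops(blocks):
    """Primitive operators as (name, preconditions, add effects, delete effects)."""
    ops = []
    for x in blocks:
        ops.append((f"pickup({x})",
                    {("clear", x), ("ontable", x), ("handempty",)},
                    {("holding", x)},
                    {("clear", x), ("ontable", x), ("handempty",)}))
        ops.append((f"putdown({x})",
                    {("holding", x)},
                    {("clear", x), ("ontable", x), ("handempty",)},
                    {("holding", x)}))
        for y in blocks:
            if x != y:
                ops.append((f"stack({x},{y})",
                            {("holding", x), ("clear", y)},
                            {("on", x, y), ("clear", x), ("handempty",)},
                            {("holding", x), ("clear", y)}))
                ops.append((f"unstack({x},{y})",
                            {("on", x, y), ("clear", x), ("handempty",)},
                            {("holding", x), ("clear", y)},
                            {("on", x, y), ("clear", x), ("handempty",)}))
    return ops

def plan(state, goal, blocks):
    """Breadth-first search from the current state to any state satisfying goal."""
    ops = ground_ops(blocks)
    frontier = deque([(frozenset(state), [])])
    seen = {frozenset(state)}
    while frontier:
        s, steps = frontier.popleft()
        if goal <= s:
            return steps
        for name, pre, add, rem in ops:
            if pre <= s:
                nxt = frozenset((s - rem) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None

# The taught action "stack A on B" is stored only as its goal fluents.
goal = {("on", "A", "B")}
start = {("ontable", "A"), ("ontable", "B"), ("clear", "A"), ("clear", "B"), ("handempty",)}
print(plan(start, goal, ["A", "B"]))   # ['pickup(A)', 'stack(A,B)']
```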
To enable situated human-robot dialogue, techniques that support grounded language communication are essential. One particular challenge is to ground human language to the robot's internal representation of the physical world. Although co-present in a shared environment, humans and robots have mismatched capabilities in reasoning, perception, and action, so their representations of the shared environment and joint tasks are significantly misaligned. Humans and robots need to make extra effort to bridge this gap and strive for a common ground of the shared world; only then can the robot engage in language communication and joint tasks. Computational models for language grounding therefore need to take collaboration into consideration. A robot not only needs to incorporate collaborative effort from human partners to better connect human language to its own representation, but also needs to make extra collaborative effort to communicate its representation in language that humans can understand. To address these issues, the Language and Interaction Research group (LAIR) at Michigan State University has investigated multiple aspects of collaborative language grounding. This article gives a brief introduction to this research effort and discusses several collaborative approaches to grounding language to perception and action.