For robots to interact with multiple persons, they must be able to identify their addressees. We classify methods of addressee detection and selection into two categories: passive and active approaches. In passive approaches, the robot is programmed to detect a predefined signal, e.g., a voice command or a specific gesture, from a person who is supposed to be the addressee. In active approaches, by contrast, the robot selects a person as an addressee based on subtle cues inferred from human pose, gaze, and facial expression. We present two new approaches for attention-based addressee selection, one passive and one active. The passive method is designed for the robot to recognize the common hand-waving gesture, using a Bayesian ensemble approach that fuses hand detections from depth segmentation, palm shape, skin color, and body pose. The active method is developed for the robot to perform natural interaction with multiple persons. It employs a novel human attention estimation algorithm based on human detection, tracking, upper body pose recognition, face detection, gaze detection, lip motion analysis, and facial expression recognition. Extensive experiments were conducted, and the effectiveness of the proposed approaches is reported.
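To make the idea of fusing several hand-detection cues concrete, the following is a minimal sketch of one simple Bayesian fusion scheme: naive-Bayes combination of binary detector outputs in log-odds form. It is an illustration under stated assumptions, not the ensemble proposed in the paper, whose exact formulation the abstract does not give. The function name `fuse_hand_cues`, the table `CUE_RATES`, and all numeric rates are hypothetical, and the cues are assumed to be conditionally independent given whether a hand is present.

```python
import math

# Hypothetical per-cue rates: (P(cue fires | hand), P(cue fires | no hand)).
# These numbers are illustrative assumptions, not values from the paper.
CUE_RATES = {
    "depth_segmentation": (0.85, 0.20),
    "palm_shape": (0.75, 0.10),
    "skin_color": (0.90, 0.35),
    "body_pose": (0.70, 0.15),
}


def fuse_hand_cues(observations, prior=0.5, rates=CUE_RATES):
    """Return P(hand | observations) assuming conditionally independent cues.

    observations maps a cue name to True/False (whether that detector fired).
    Cues absent from observations contribute no evidence.
    """
    # Start from the prior, expressed as log-odds.
    log_odds = math.log(prior / (1.0 - prior))
    for cue, fired in observations.items():
        p_hand, p_background = rates[cue]
        if fired:
            # Evidence from a detection: likelihood ratio of "fires".
            log_odds += math.log(p_hand / p_background)
        else:
            # Evidence from a miss: likelihood ratio of "does not fire".
            log_odds += math.log((1.0 - p_hand) / (1.0 - p_background))
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    all_fired = {cue: True for cue in CUE_RATES}
    print(fuse_hand_cues(all_fired))
```

Working in log-odds keeps the update additive per cue, so a detector that is unavailable in a given frame can simply be left out of `observations` without changing the rest of the computation.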