In this paper, we present a framework for facilitation robots that regulate imbalanced engagement density in a four-participant conversation by joining as the fourth participant and following proper procedures for obtaining the initiative. Four is a special number in multiparty conversations. In three-participant conversations, the minimum unit for multiparty conversations, a social imbalance sometimes occurs in which one participant is left behind in the current conversation. In such scenarios, a conversational robot has the potential to objectively observe and control the situation as the fourth participant. Consequently, we present model procedures for obtaining the conversational initiative in incremental steps to harmonize such four-participant conversations. During the procedures, a facilitator must be aware of both the presence of dominant participants leading the current conversation and the status of any participant who is left behind. We model and optimize these situations and procedures as a partially observable Markov decision process (POMDP), which is suitable for real-world sequential decision processes. The results of experiments conducted to evaluate the proposed procedures show evidence of their acceptability and of a feeling of groupness.
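As an illustration of the POMDP formulation mentioned above, the following minimal sketch shows the standard Bayesian belief update that any POMDP agent performs after taking an action and receiving an observation. It is not the authors' model: the state names (`balanced`, `left_behind`), the action `observe`, the observations (`speaking`, `silent`), and all probabilities are hypothetical values chosen only to make the example runnable.

```python
def belief_update(belief, action, observation, T, O):
    """Standard POMDP belief update:
    b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s).
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: push the current belief through the transition model.
        predicted = sum(T[action][s][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = O[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy model (assumed for illustration, not taken from the paper).
# T[action][state][next_state]: transition probabilities.
T = {
    "observe": {
        "balanced": {"balanced": 0.9, "left_behind": 0.1},
        "left_behind": {"balanced": 0.2, "left_behind": 0.8},
    }
}
# O[action][next_state][observation]: probability that the watched
# participant is seen speaking or silent in each hidden state.
O = {
    "observe": {
        "balanced": {"speaking": 0.7, "silent": 0.3},
        "left_behind": {"speaking": 0.2, "silent": 0.8},
    }
}

prior = {"balanced": 0.5, "left_behind": 0.5}
posterior = belief_update(prior, "observe", "silent", T, O)
print(posterior)
```

Under these assumed numbers, observing silence shifts the belief toward the `left_behind` state. A facilitation policy would then choose its next action from this belief rather than from a directly observed state.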
SUMMARY: Spoken dialogue between human subjects is based on the linguistic information contained in the utterance. In addition, the psychological state of the speaker and information complementing the dialogue are conveyed by prosody, facial expression, and head movement, which help the dialogue proceed smoothly. Such information, which co-occurs with the utterance and supports the smooth transmission of linguistic information, is called paralinguistic information. This paper treats the attitude of the speaker, as represented by prosody and head gestures, as paralinguistic information. Methods of recognizing each kind of information are proposed, and a dialogue robot is realized on the basis of the proposed methods. In the recognition of utterance attitude from prosody, the positive or negative attitude of the utterance is recognized on the basis of the F0 (fundamental frequency) pattern and the phoneme duration. In the recognition of head gestures, a nod is defined as representing a positive attitude and a tilt or shake of the head as representing a negative attitude. These three motions are recognized using optical flow as the feature parameters and hidden Markov models (HMMs) as the stochastic model. It is shown experimentally that the proposed method achieves the same recognition ability as humans. It is also shown that a dialogue robot incorporating the proposed method achieves a rhythmic, efficient dialogue that earlier systems had not achieved.
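The HMM-based gesture recognition described above can be sketched as follows: one HMM is kept per gesture, each candidate model is scored with the forward algorithm, and the gesture whose model gives the highest likelihood is selected. This is a minimal sketch under strong assumptions, not the authors' implementation. The optical flow is assumed to be quantized into four direction symbols (`U`, `D`, `L`, `R`), each model has two hidden states, and every parameter value below is a hypothetical toy number.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model)."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale at every step to avoid numerical underflow.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def make_model(sym_a, sym_b, switch_prob, main_prob):
    """Hypothetical two-state model; state 0 favours sym_a, state 1 sym_b."""
    rest = (1.0 - main_prob) / 3.0
    symbols = ["U", "D", "L", "R"]
    emit = [
        {s: (main_prob if s == sym_a else rest) for s in symbols},
        {s: (main_prob if s == sym_b else rest) for s in symbols},
    ]
    trans = [[1.0 - switch_prob, switch_prob], [switch_prob, 1.0 - switch_prob]]
    return {"start": [0.5, 0.5], "trans": trans, "emit": emit}


# Toy models (assumed): nod and shake alternate quickly between opposite
# directions; tilt is modelled as a sustained sideways motion.
MODELS = {
    "nod": make_model("U", "D", switch_prob=0.8, main_prob=0.85),
    "shake": make_model("L", "R", switch_prob=0.8, main_prob=0.85),
    "tilt": make_model("L", "R", switch_prob=0.1, main_prob=0.85),
}


def classify(obs):
    """Return the gesture whose HMM assigns the highest log-likelihood."""
    return max(
        MODELS,
        key=lambda g: forward_log_likelihood(
            obs, MODELS[g]["start"], MODELS[g]["trans"], MODELS[g]["emit"]
        ),
    )


print(classify(["U", "D", "U", "D"]))
print(classify(["L", "R", "L", "R"]))
print(classify(["L", "L", "L", "L"]))
```

In a real system the model parameters would be estimated from labelled gesture data (for example with Baum-Welch training) and the observations would more likely be continuous optical-flow features than four discrete symbols.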