“…For example, by recognizing user engagement, a system can control turn-taking behaviors [12,13] and dialogue policies [14,15,16], thereby improving the quality of the user experience throughout the dialogue. As input features for engagement recognition, we can exploit non-verbal multimodal behaviors such as eye gaze [17,18,19,20,12,21,15], backchannels (e.g., "yeah") [19,21], laughter [22], head nodding [21], facial movement and direction [17,15], and spatial location and distance [23,24,12], as well as conversational interaction features such as adjacency pairs [19]. In addition, the direct use of low-level signals such as acoustic and image features has been explored [10,25,26,27].…”
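To make the feature-based setup concrete, the following is a minimal sketch, not taken from any of the cited works, of how hand-crafted multimodal behaviors like those listed above might be assembled into a feature vector and scored for engagement. All feature names, weights, and thresholds here are illustrative assumptions; in practice the weights would be learned from labeled interaction data.

```python
import math

def extract_features(window):
    """Map per-window behavior observations (a dict) to a feature vector.
    The keys are hypothetical placeholders for the behaviors surveyed above."""
    return [
        window.get("gaze_on_agent_ratio", 0.0),  # fraction of frames gazing at the agent
        window.get("backchannel_count", 0),      # e.g., "yeah", "uh-huh"
        window.get("laugh_count", 0),
        window.get("nod_count", 0),
        window.get("distance_m", 2.0),           # proxemic distance in meters
    ]

def engagement_score(features, weights, bias=0.0):
    """Weighted sum of features squashed to [0, 1] with a logistic function."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights: closer distance and more gaze/backchannels/nods
# push the score toward "engaged".
WEIGHTS = [2.0, 0.5, 0.8, 0.6, -1.0]

window = {"gaze_on_agent_ratio": 0.9, "backchannel_count": 3,
          "laugh_count": 1, "nod_count": 2, "distance_m": 1.0}
score = engagement_score(extract_features(window), WEIGHTS, bias=-1.5)
engaged = score > 0.5
```

A real system would replace the fixed weights with a trained classifier and could concatenate the low-level acoustic and image features mentioned above into the same vector.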