Companion Publication of the 2021 International Conference on Multimodal Interaction
DOI: 10.1145/3461615.3486575
Addressing Data Scarcity in Multimodal User State Recognition by Combining Semi-Supervised and Supervised Learning

Abstract: Detecting the mental states of human users is crucial for the development of cooperative and intelligent robots, as it enables the robot to understand the user's intentions and desires. Despite their importance, it is difficult to obtain large amounts of high-quality data for training automatic recognition algorithms, as the time and effort required to collect and label such data are prohibitively high. In this paper, we present a multimodal machine learning approach for detecting dis-/agreement and confusion states…
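The abstract stops short of architectural details, but the core idea of training on a small labeled set together with a larger unlabeled one can be sketched. The PyTorch snippet below is a hypothetical illustration of combining a supervised cross-entropy loss with a pseudo-label loss over fused audio/video features; the class names, feature dimensions, confidence threshold, and late-fusion strategy are assumptions and are not taken from the paper.

```python
# Hypothetical sketch: supervised + pseudo-label training over fused multimodal features.
# Not the architecture from the paper; names, dimensions, and thresholds are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionClassifier(nn.Module):
    """Late-fusion classifier over pre-extracted audio and video feature vectors."""
    def __init__(self, audio_dim=128, video_dim=256, num_classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + video_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, audio_feat, video_feat):
        return self.net(torch.cat([audio_feat, video_feat], dim=-1))

def training_step(model, labeled_batch, unlabeled_batch, optimizer,
                  pseudo_threshold=0.9, unsup_weight=0.5):
    a_l, v_l, y = labeled_batch      # labeled audio/video features and class labels
    a_u, v_u = unlabeled_batch       # unlabeled audio/video features

    # Supervised branch: standard cross-entropy on the labeled examples.
    sup_loss = F.cross_entropy(model(a_l, v_l), y)

    # Semi-supervised branch: keep only confident pseudo-labels on unlabeled data.
    with torch.no_grad():
        probs = F.softmax(model(a_u, v_u), dim=-1)
        conf, pseudo_y = probs.max(dim=-1)
        mask = conf > pseudo_threshold
    if mask.any():
        unsup_loss = F.cross_entropy(model(a_u[mask], v_u[mask]), pseudo_y[mask])
    else:
        unsup_loss = torch.zeros((), device=a_u.device)

    loss = sup_loss + unsup_weight * unsup_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```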

Cited by 2 publications (1 citation statement)
References 33 publications (34 reference statements)
“…Furthermore, AL techniques can encounter challenges with technically complex data, as designing an effective query strategy is non-trivial, and uninformative examples may be selected. Recently, Voß and colleagues [48] tackled multimodal dis-/agreement classification in human-robot interactions and YouTube videos using semi-supervised deep architectures. While their work demonstrates promising results, it still relies on a supervised branch for the final classification, necessitating a significant amount of labeled examples to generalize effectively.…”
Section: Related Work
confidence: 99%
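For context on the query-strategy difficulty the citing authors mention, the sketch below shows a basic least-confidence active-learning selector over the same kind of fused multimodal model. It is a hypothetical PyTorch illustration (the `select_queries` helper, its parameters, and the budget are invented here) and does not come from either cited paper.

```python
# Hypothetical sketch: least-confidence query selection for active learning.
import torch
import torch.nn.functional as F

def select_queries(model, unlabeled_audio, unlabeled_video, budget=16):
    """Return indices of the `budget` least-confident unlabeled examples."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_audio, unlabeled_video), dim=-1)
    confidence, _ = probs.max(dim=-1)                 # confidence of the top prediction
    k = min(budget, confidence.numel())
    return torch.topk(-confidence, k=k).indices       # least confident examples first
```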