2007
DOI: 10.1109/tmm.2006.886310
Audio-Visual Affect Recognition

Abstract: The ability of a computer to detect and appropriately respond to changes in a user's affective state has significant implications to Human-Computer Interaction (HCI). In this paper, we present our efforts toward audio-visual affect recognition on 11 affective states customized for HCI application (four cognitive/motivational and seven basic affective states) of 20 nonactor subjects. A smoothing method is proposed to reduce the detrimental influence of speech on facial expression recognition. The featu…

Cited by 122 publications (47 citation statements). References 16 publications (13 reference statements).
“…Facial expressions [20], [21], [22], vocal features [23] [24] [25], body movements and postures [26], [27], [11], [28], physiological signals [29] have been used as inputs during these attempts, although multimodal emotion recognition is currently gaining ground [7], [30], [31], [32], [33]. Nevertheless, most of the work has considered the integration of information from facial expressions and speech [34], [35] and there have been relatively few attempts to combine information from body movement and gestures in a multimodal framework. Gunes and Piccardi [8], for example, fused facial expressions and body gestures at different levels for bimodal emotion recognition.…”
Section: Related Work (mentioning)
confidence: 99%
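The fusion "at different levels" referred to in the statement above is usually a contrast between feature-level (early) fusion and decision-level (late) fusion. Below is a minimal sketch of early fusion by feature concatenation; the array shapes, feature dimensions, classifier choice, and function names are illustrative assumptions for this example only, not the pipeline of Gunes and Piccardi or of the cited paper.

```python
# Illustrative sketch of feature-level (early) fusion for bimodal affect
# recognition: per-sample feature vectors from two modalities are
# concatenated and fed to a single classifier. All shapes and data are
# arbitrary placeholders, not values from any cited work.
import numpy as np
from sklearn.linear_model import LogisticRegression

def early_fusion(face_feats: np.ndarray, body_feats: np.ndarray) -> np.ndarray:
    """Concatenate facial and body/gesture feature vectors per sample."""
    return np.concatenate([face_feats, body_feats], axis=1)

rng = np.random.default_rng(0)
X_face = rng.normal(size=(100, 32))   # 32-dim facial features (placeholder)
X_body = rng.normal(size=(100, 16))   # 16-dim gesture features (placeholder)
y = rng.integers(0, 4, size=100)      # four affect classes (placeholder)

X_fused = early_fusion(X_face, X_body)            # shape: (100, 48)
clf = LogisticRegression(max_iter=1000).fit(X_fused, y)
print(clf.predict(X_fused[:5]))                   # predicted affect labels
```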
“…In these datasets, interactions include two interlocutors, who are recorded carrying out both structured and unstructured conversations. There are numerous examples of such datasets, with a wide range of applications, such as speech recognition [15], behavior analysis [50], segmentation, emotion recognition [12] and depression detection [16]. Arguably, one of the most popular datasets of one-to-one interactions is SEMAINE [30].…”
Section: Related Work (mentioning)
confidence: 99%
“…A number of studies favor decision-level fusion as the preferred method of data fusion because errors from different classifiers tend to be uncorrelated and the methodology is feature-independent [66]. Bimodal fusion methods have been proposed in numerous instances [12,67,68], but optimal information fusion configurations remain elusive.…”
Section: Multimodal Fusion (mentioning)
confidence: 99%
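Decision-level (late) fusion, as described in the statement above, trains a separate classifier per modality and combines only their outputs, which is why it is feature-independent. The following is a hedged sketch assuming per-modality class posteriors are simply weighted and averaged; the classifiers, weights, and synthetic data are assumptions for illustration, not a configuration from the cited studies.

```python
# Minimal sketch of decision-level (late) fusion: each modality has its own
# classifier, and only the predicted class posteriors are combined.
# The weighting scheme and synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X_audio = rng.normal(size=(200, 20))   # e.g. prosodic features (placeholder)
X_video = rng.normal(size=(200, 30))   # e.g. facial features (placeholder)
y = rng.integers(0, 7, size=200)       # seven basic affective states

audio_clf = SVC(probability=True).fit(X_audio, y)
video_clf = RandomForestClassifier(n_estimators=100).fit(X_video, y)

def late_fusion(p_audio: np.ndarray, p_video: np.ndarray,
                w_audio: float = 0.5) -> np.ndarray:
    """Combine per-modality class posteriors with a convex weight."""
    return w_audio * p_audio + (1.0 - w_audio) * p_video

probs = late_fusion(audio_clf.predict_proba(X_audio),
                    video_clf.predict_proba(X_video))
pred = probs.argmax(axis=1)            # fused affect prediction per sample
```

Because the combination step only sees posterior probabilities, the two base classifiers can use entirely different feature sets and model families, and their errors are often less correlated than those of a single classifier trained on concatenated features.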