2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.01024
Context-Aware Emotion Recognition Networks

Abstract: Traditional techniques for emotion recognition have focused on facial expression analysis only, thus providing limited ability to encode context that comprehensively represents emotional responses. We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only human facial expression but also context information in a joint and boosting manner. The key idea is to hide human faces in a visual scene and seek other contexts based on an attention mechanism. Our networ…
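The face-hiding idea in the abstract can be illustrated with a minimal sketch: zero out the face region of a feature map, then attention-pool what remains so the context stream attends to non-face cues. This is an illustrative stand-in, not the authors' implementation; `context_attention_pool`, `face_box`, and the use of the feature norm as an attention score are all assumptions made here for clarity (CAER-Net learns its attention with a sub-network).

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flat array.
    e = np.exp(x - x.max())
    return e / e.sum()

def context_attention_pool(feature_map, face_box):
    """Hide the face region, then attention-pool the remaining context.

    feature_map: (H, W, C) array of convolutional features.
    face_box: (y0, y1, x0, x1) region to zero out (the detected face).
    Both names are illustrative, not the paper's actual API.
    """
    fm = feature_map.copy()
    y0, y1, x0, x1 = face_box
    fm[y0:y1, x0:x1, :] = 0.0            # "hide" the face in the context stream

    # One attention score per spatial location; the feature norm is a
    # hand-crafted stand-in for the learned attention sub-network.
    scores = np.linalg.norm(fm, axis=-1)               # (H, W)
    weights = softmax(scores.ravel()).reshape(scores.shape)

    # Attention-weighted average of context features -> (C,) vector.
    return (fm * weights[..., None]).sum(axis=(0, 1))
```

Because the face region is zeroed before scoring, it contributes nothing to the pooled vector; the remaining spatial locations compete for attention via the softmax.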

Cited by 186 publications (163 citation statements) · References 48 publications
“…Moreover, the rapid development and mass production of consumer grade devices, such as smartband and smartwatch (Poongodi et al, 2020), will facilitate the integration of these signals in most of HRI systems. For example, in Lazzeri et al (2014) a multimodal acquisition platform, including a humanoid robot capable of expressing emotions, was tested in a social robot-based therapy scenario for children with autism. The system included audio and video sources, together with electrocardiogram (ECG), GSR, respiration, and accelerometer, that are integrated in a sensorized t-shirt, but the platform was designed to be flexible and reconfigurable in order to connect with various hardware devices.…”
Section: Peripheral Physiological Responses and Multimodal Approaches
confidence: 99%
“…These studies were conducted with broad and separate focuses, such as establishing datasets [11,12,25], feature mapping or feature selection [26,27], designing architectures [27,28], and adopting approaches from other fields of study [25,27]. More recent studies adopted CNNs for face emotion recognition, such as the approaches proposed in [29][30][31]. One attractive feature of emotion recognition, if not the most attractive, is the human face itself, mainly because the human face mostly expresses the emotions one feels [32].…”
Section: Studies and Developments
confidence: 99%
“…In recent years, Convolutional Neural Network (CNN)-based methods [14]-[16] have achieved more accurate and robust emotion recognition than previous methods under changes in surrounding information. Yu et al [15] created a model to recognize emotions in static images, which contains three face detectors and a module of multiple deep CNNs.…”
Section: Introduction
confidence: 99%
“…In visual emotion recognition, other visual cues such as body gestures, actions, and environmental contexts can show additional useful information. Thus, Lee et al [14] integrated the facial expressions and surrounding information of people with adaptive fusion networks to demonstrate that the performance of emotion recognition networks can be remarkably boosted by integrating facial and context information. In fact, the emotion recognition system can analyze features at the local pixel level, which are extracted by a specific convolution receptive field.…”
Section: Introduction
confidence: 99%
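The adaptive fusion described in the statement above can be sketched as follows: each stream (face, context) gets a scalar importance score, the scores are softmax-normalized, and the weighted features are concatenated before classification. This is a minimal sketch of the idea only; `adaptive_fusion`, `w_face`, and `w_ctx` are hypothetical names, and in the actual networks the scores come from small learned layers rather than being passed in directly.

```python
import numpy as np

def adaptive_fusion(face_feat, ctx_feat, w_face, w_ctx):
    """Blend face and context features with normalized attention weights.

    w_face / w_ctx stand in for the outputs of small learned scoring
    layers (hypothetical; the real scores are predicted per sample).
    """
    scores = np.array([w_face, w_ctx], dtype=float)
    lam = np.exp(scores - scores.max())
    lam /= lam.sum()                       # softmax over the two streams
    # Weight each stream, then concatenate for the downstream classifier.
    return np.concatenate([lam[0] * face_feat, lam[1] * ctx_feat])
```

With equal scores both streams contribute equally (weights 0.5 each); a higher face score suppresses the context features and vice versa, which is the "adaptive" part of the fusion.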