2017 IEEE Winter Conference on Applications of Computer Vision (WACV) 2017
DOI: 10.1109/wacv.2017.140

Deep Spatio-Temporal Features for Multimodal Emotion Recognition

Cited by 60 publications (65 citation statements)
References 27 publications
“…A. Pre-processing All frames are initially extracted from visual signal for further steps. Since such extracted frames still contain considerable redundant information for emotion detection, we extract only the face regions using the simple algorithm [1] as follows:…”
Section: Proposed Methodology | mentioning
confidence: 99%
“…Tran et al. [107] proposed the well-designed C3D, which exploits 3D convolutions on large-scale supervised training datasets to learn spatio-temporal features. Many related studies (e.g., [108], [109]) have employed this network for FER involving image sequences.…”
Section: Convolutional Neural Network (CNN) | mentioning
confidence: 99%
“…sequence and weighted based on their prediction scores. Instead of directly using C3D for classification, [109] employed C3D for spatio-temporal feature extraction and then cascaded with DBN for prediction. In [201], C3D was also used as a feature extractor, followed by a NetVLAD layer [202] to aggregate the temporal information of the motion features by learning cluster centers.…”
Section: RNN and C3D | mentioning
confidence: 99%
“…In recent years, deep learning has become a new classifier in many emotion recognition tasks. Nguyen et al. introduce a novel approach using 3-dimensional convolutional neural networks (C3Ds) and multimodal deep belief networks (DBNs) to improve the performance of multimodal emotion recognition [9].…”
Section: Related Work | mentioning
confidence: 99%