Deep Spatio-Temporal Features for Multimodal Emotion Recognition

Nguyen, Dung; Nguyen, Kien; Sridharan, Sridha; Ghasemi, Afsane; Fookes, Clinton

doi:10.1109/wacv.2017.140

Cited by 60 publications

(65 citation statements)

References 27 publications

“…A. Pre-processing All frames are initially extracted from visual signal for further steps. Since such extracted frames still contain considerable redundant information for emotion detection, we extract only the face regions using the simple algorithm [1] as follows:…”

Section: Proposed Methodology (mentioning)

confidence: 99%

Meta Transfer Learning for Facial Emotion Recognition

Nguyen, Sridharan, et al. 2018

2018 24th International Conference on Pattern Recognition (ICPR)

Self Cite

The use of deep learning techniques for automatic facial expression recognition has recently attracted great interest but developed models are still unable to generalize well due to the lack of large emotion datasets for deep learning. To overcome this problem, in this paper, we propose utilizing a novel transfer learning approach relying on PathNet and investigate how knowledge can be accumulated within a given dataset and how the knowledge captured from one emotion dataset can be transferred into another in order to improve the overall performance. To evaluate the robustness of our system, we have conducted various sets of experiments on two emotion datasets: SAVEE and eNTERFACE. The experimental results demonstrate that our proposed system leads to improvement in performance of emotion recognition and performs significantly better than the recent state-of-the-art schemes adopting fine-tuning/pre-trained approaches.

“…Tran et al. [107] proposed the well-designed C3D, which exploits 3D convolutions on large-scale supervised training datasets to learn spatio-temporal features. Many related studies (e.g., [108], [109]) have employed this network for FER involving image sequences.…”

Section: Convolutional Neural Network (CNN) (mentioning)

confidence: 99%

“…sequence and weighted based on their prediction scores. Instead of directly using C3D for classification, [109] employed C3D for spatio-temporal feature extraction and then cascaded with DBN for prediction. In [201], C3D was also used as a feature extractor, followed by a NetVLAD layer [202] to aggregate the temporal information of the motion features by learning cluster centers.…”

Section: RNN and C3D (mentioning)

confidence: 99%

Deep Facial Expression Recognition: A Survey

Deng, 2022

IEEE Trans. Affective Comput.

1,015

593

With the transition of facial expression recognition (FER) from laboratory-controlled to challenging in-the-wild conditions and the recent success of deep learning techniques in various fields, deep neural networks have increasingly been leveraged to learn discriminative representations for automatic FER. Recent deep FER systems generally focus on two important issues: overfitting caused by a lack of sufficient training data and expression-unrelated variations, such as illumination, head pose and identity bias. In this paper, we provide a comprehensive survey on deep FER, including datasets and algorithms that provide insights into these intrinsic problems. First, we introduce the available datasets that are widely used in the literature and provide accepted data selection and evaluation principles for these datasets. We then describe the standard pipeline of a deep FER system with the related background knowledge and suggestions of applicable implementations for each stage. For the state of the art in deep FER, we review existing novel deep neural networks and related training strategies that are designed for FER based on both static images and dynamic image sequences, and discuss their advantages and limitations. Competitive performances on widely used benchmarks are also summarized in this section. We then extend our survey to additional related issues and application scenarios. Finally, we review the remaining challenges and corresponding opportunities in this field as well as future directions for the design of robust deep FER systems.

“…In recent years, deep learning has become a new classifier in many emotion recognition tasks. Nguyen et al. introduce a novel approach using 3-dimensional convolutional neural networks (C3Ds) and multimodal deep belief networks (DBNs) to improve the performance of multimodal emotion recognition [9].…”

Section: Related Work (mentioning)

confidence: 99%

Imbalance Learning-based Framework for Fear Recognition in the MediaEval Emotional Impact of Movies Task

Zhang, Cheng, et al. 2018

Interspeech 2018

Fear recognition, which aims at predicting whether a movie segment can induce fear or not, is a promising area in movie emotion recognition. Research in this area, however, has reached a bottleneck. Difficulties may partly result from the imbalanced database. In this paper, we propose an imbalance learning-based framework for movie fear recognition. A data rebalance module is adopted before classification. Several sampling methods, including the proposed softsampling and hardsampling which combine the merits of both undersampling and oversampling, are explored in this module. Experiments are conducted on the MediaEval 2017 Emotional Impact of Movies Task. Compared with the current state-of-the-art, we achieve an improvement of 8.94% on F1, proving the effectiveness of proposed framework.
