2019
DOI: 10.1080/02564602.2019.1645620

Video-Based Facial Expression Recognition using Deep Temporal–Spatial Networks

Cited by 12 publications (3 citation statements)
References 20 publications
“…Fan et al. [23] fused discriminative features extracted by a CNN model with hand-crafted features describing shape and appearance. Pan et al. [24] designed a deep temporal-spatial network to extract spatiotemporal features from facial expressions. Chen et al. [25] proposed a facial feature called the deep peak-calmness difference (DPND), which characterizes the facial regions that change from a calm to an expressive face, and achieved high-quality results with both unsupervised clustering and semi-supervised classification methods.…”
Section: A. Emotion Recognition With Facial Expressions
confidence: 99%
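The peak-calmness difference described above amounts to contrasting deep embeddings of an expression-peak frame against those of a calm (near-neutral) frame. The following minimal sketch illustrates that idea only; it assumes a generic ImageNet-pretrained ResNet-18 backbone rather than the architecture actually used by Chen et al. [25].

```python
# Minimal sketch of a peak-calmness difference feature.
# Assumption: a torchvision ResNet-18 stands in for the (unspecified) deep backbone.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Backbone without its classification head; outputs a 512-d embedding per image.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def peak_calmness_difference(calm_img: Image.Image, peak_img: Image.Image) -> torch.Tensor:
    """Difference of deep embeddings between the expression-peak and calm frames."""
    batch = torch.stack([preprocess(calm_img), preprocess(peak_img)])
    calm_feat, peak_feat = backbone(batch)
    # Dimensions that change strongly between calm and peak dominate the feature.
    return peak_feat - calm_feat
```

The difference vector can then feed a downstream classifier or clustering step, as in the unsupervised and semi-supervised settings mentioned in the statement.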
“…Arduino device images collected on-site from English education resource libraries are selected as the experimental data, covering the Arduino UNO board, L298N driver board, Hall sensor, active buzzer, joystick, serial wireless transparent-transmission module, PIR human-body sensor, potentiometer, ultrasonic module, and LCD [18,19]. For each device, 500 images are collected, giving a total of 5,000 images.…”
Section: Data Sources and Preprocessing in This Experiment
confidence: 99%
“…They also proposed using the average human face as a substitute for the neutral face in cases where a neutral face was not available for reference. Pan et al. [41] used the magnitude of the optical flow between successive video frames to characterize their relative motion, forming a temporal channel in their spatiotemporal video-based FER model.…”
Section: Optical Flow and FER
confidence: 99%
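The temporal channel referred to above is essentially a per-frame motion-magnitude map computed from dense optical flow. The sketch below shows one way to build such a channel; Farneback flow, the normalization to [0, 255], and the file name "clip.mp4" are illustrative assumptions, not details taken from the cited work.

```python
# Hedged sketch: optical-flow magnitude between successive frames as a temporal channel.
import cv2
import numpy as np

def flow_magnitude_channel(prev_gray: np.ndarray, next_gray: np.ndarray) -> np.ndarray:
    """Return a single-channel image encoding motion magnitude between two frames."""
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
    )
    mag, _ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Scale to [0, 255] so the map can be stacked with grayscale appearance channels.
    return cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# Example usage: build per-frame temporal channels for a short clip (placeholder path).
cap = cv2.VideoCapture("clip.mp4")
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
temporal_channels = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    temporal_channels.append(flow_magnitude_channel(prev_gray, gray))
    prev_gray = gray
cap.release()
```

Stacking such magnitude maps alongside the appearance frames is one simple way to give a spatiotemporal FER network an explicit motion cue.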