2020
DOI: 10.48550/arxiv.2001.01168
Preprint

Spatio-Temporal Relation and Attention Learning for Facial Action Unit Detection

Abstract: Spatio-temporal relations among facial action units (AUs) convey significant information for AU detection yet have not been thoroughly exploited. The main reasons are the limited capability of current AU detection works in simultaneously learning spatial and temporal relations, and the lack of precise localization information for AU feature learning. To tackle these limitations, we propose a novel spatio-temporal relation and attention learning framework for AU detection. Specifically, we introduce a spatiotem…

Cited by 10 publications (21 citation statements)
References 25 publications
“…Popular methods include using geometric features that track facial landmarks [7], histogram-based approaches that cluster local features into uniform regions for processing [32] or using features that describe local neighbourhoods [2]. With the popularity of deep learning, CNN [11,14,21] and graph-based [19,36] approaches have achieved state-of-the-art results for AU detection due to their ability to hierarchically learn spatial features. Capsule-based computations [31] offer an improvement as along with learning to detect different facial features, they also learn how these are arranged with respect to each other.…”
Section: Spatial Analysis For AU Prediction (mentioning)
confidence: 99%
“…However, their approach focuses on extracting spatio-temporal features from complete video sequences at once, dropping certain adjacent frames to ensure all video sequences are of the same length. Other more recent approaches learn semantic relationships between different face regions and represent these using structured knowledge-graphs, learning coupling patterns between regions using graph-based computations [19,36].…”
Section: Spatio-temporal Analysis For AU Prediction (mentioning)
confidence: 99%