2022
DOI: 10.1109/taffc.2022.3197761

Disentangling Identity and Pose for Facial Expression Recognition

Abstract: Facial expression recognition (FER) is a challenging problem because the expression component is always entangled with other irrelevant factors, such as identity and head pose. In this work, we propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representation. We regard the holistic facial representation as the combination of identity, pose and expression. These three components are encoded with different encoders. For identity encoder, …

citations
Cited by 19 publications
(8 citation statements)
references
References 46 publications
“…Comparison on FERPlus dataset is shown in Table II. It can been seen that our FER-former achieved best FER performance compared to other methods, including FER with unconstrained variations (RAN [11], IPD-FER [58]), and FER…”
Section: A Comparison With State-of-the-art Methods (mentioning)
confidence: 92%
“…For RAF-DB and FERPlus datasets, the pre-trained IR-50 [51] on Ms-Celeb-1M [52] is adopted as a feature extractor, which is consistent with TranFER [23]. For SFEW 2.0 dataset, we pre-train FER-former on RAF-DB dataset and then finetune it on SFEW 2.0 dataset, which is consistent with IPD-FER [58]. Regarding our scratch-trained Transformer encoder, the depth is 16, the embedding dimension is 256, the number of heads is 4, and the mlp ratio is 4.…”
Section: Methods (mentioning)
confidence: 93%
“…Ruan et al [6] learned intraclass features and interclass features by decomposing and reconstructing. Jiang et al [30] proposed an identity and pose disentangled method, which separates expression features from the identity and pose.…”
Section: Related Work (mentioning)
confidence: 99%