2021
DOI: 10.48550/arxiv.2107.03670
Preprint

Feature Pyramid Network for Multi-task Affective Analysis

Ruian He,
Zhen Xing,
Weimin Tan
et al.

Abstract: Affective analysis is not a single task: the valence-arousal value, expression class, and action unit can be predicted at the same time. Previous research has not paid enough attention to the entanglement and hierarchical relation of these three facial attributes. We propose a novel model named feature pyramid networks for multi-task affect analysis. Hierarchical features are extracted to predict the three labels, and we apply a teacher-student training strategy to learn from pretrained single-task models. …
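
The teacher-student strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss functions, so the choice of mean squared error for valence-arousal, temperature-softened KL divergence for expression classes, and soft-target binary cross-entropy for action units is an assumption, as are the dictionary keys (`va`, `expr`, `au`), the temperature, and the task weights.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-softened softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mse(a, b):
    # Mean squared error between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def soft_bce(student_logits, teacher_logits, eps=1e-12):
    # Binary cross-entropy per action unit, using teacher probabilities as soft targets.
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        q, p = sigmoid(s), sigmoid(t)
        total += -(p * math.log(q + eps) + (1.0 - p) * math.log(1.0 - q + eps))
    return total / len(student_logits)

def distillation_loss(student, teacher, temperature=2.0, weights=(1.0, 1.0, 1.0)):
    # Hypothetical combined loss: one term per task, each comparing the
    # multi-task student with the matching single-task teacher output.
    w_va, w_expr, w_au = weights
    va_term = mse(student["va"], teacher["va"])
    expr_term = kl_divergence(softmax(teacher["expr"], temperature),
                              softmax(student["expr"], temperature))
    au_term = soft_bce(student["au"], teacher["au"])
    return w_va * va_term + w_expr * expr_term + w_au * au_term

# Example inputs (made-up numbers): 2 valence-arousal values,
# 3 expression logits, 2 action-unit logits.
teacher_out = {"va": [0.5, -0.2], "expr": [2.0, 0.1, -1.0], "au": [1.5, -0.5]}
student_out = {"va": [0.3, 0.0], "expr": [1.0, 0.4, -0.5], "au": [0.8, -0.1]}
loss = distillation_loss(student_out, teacher_out)
```

Each term is smallest when the student reproduces the corresponding teacher output, so minimizing the sum pulls the single multi-task model toward all three single-task teachers at once. In practice such a term would usually be combined with a supervised loss on the ground-truth labels.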

Citations: cited by 2 publications (2 citation statements)
References: 26 publications
“…Since, we have trained model for single task and not used any of audio and video features, so performance is not as good as teams using multi-task learning with video features.

                            [27]  0.29    0.6491  0.4082
NTUA-CVSP                   [28]  0.3367  0.6418  0.4374
Morphoboid                  [29]  0.3511  0.668   0.4556
FLAB2021                    [30]  0.4079  0.6729  0.4953
STAR                        [31]  0.4759  0.7321  0.5604
Maybe Next Time             [32]  0.6046  0.7289  0.6456
CPIC-DIR2021                [33]  0.6834  0.7709  0.7123
Netease Fuxi Virtual Human  [34]  0.763   0.8059  0.7777
Ours                        [18]  0.361   0.675   0.4646

Table 3 shows the influence of number of networks that are collaboratively trained in CCT. It can be observed that model with 3 networks performs the best in the presence of noise.…”
Section: Performance Comparison With State-of-the-art Methods
Citation type: mentioning
confidence: 99%
“…As a recent development, face recognition has become an integral part of social cognition and has been used in various requests, pedestrian tracking, and surveillance systems. There has been significant progress in facial recognition using DCNN [10].…”
Section: Introduction
Citation type: mentioning
confidence: 99%