2023
DOI: 10.1109/access.2023.3237817

Fine-Tuning Swin Transformer and Multiple Weights Optimality-Seeking for Facial Expression Recognition

Abstract: Facial expression recognition plays a key role in human-computer emotional interaction. However, human faces in real environments are affected by various unfavorable factors that reduce expression recognition accuracy. In this paper, we propose a novel method that combines Fine-tuning Swin Transformer and Multiple Weights Optimality-seeking (FST-MWOS) to enhance expression recognition performance. FST-MWOS mainly consists of two crucial components: Fine-tuning Swin Transformer (FS…
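The abstract only sketches the approach, so the following is a minimal, hypothetical illustration (not the authors' released code) of the two ingredients it names: fine-tuning a pretrained Swin Transformer for expression classification and combining the predictions of several fine-tuned weight sets. The model name, the seven-class head, and the simple weighted-averaging rule are assumptions; the paper's actual optimality-seeking procedure for the weights is not reproduced here.

```python
# Minimal sketch (assumptions, not the paper's code): fine-tune a pretrained
# Swin Transformer for facial expression recognition with timm/PyTorch, then
# combine the softmax outputs of several fine-tuned weight sets.
import torch
import timm

NUM_CLASSES = 7  # basic expression categories (assumption)

def build_fer_swin():
    # Pretrained Swin-Tiny with a fresh classification head for expressions.
    return timm.create_model(
        "swin_tiny_patch4_window7_224", pretrained=True, num_classes=NUM_CLASSES
    )

def fine_tune_step(model, images, labels, optimizer, criterion):
    # One standard fine-tuning step on a batch of face crops.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def ensemble_predict(models, weights, images):
    # Weighted combination of softmax outputs from several fine-tuned
    # checkpoints; a simple stand-in for weighting multiple weight sets.
    probs = 0.0
    for model, w in zip(models, weights):
        model.eval()
        probs = probs + w * torch.softmax(model(images), dim=1)
    return probs.argmax(dim=1)
```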

Cited by 11 publications (4 citation statements)
References 34 publications (45 reference statements)
“…For these comparisons, we drew from an array of notable works, namely LibreFace [65], SSA-ICL [87], ECAN [88], A-MobileNet [89], DNFER [90], Muhamad et al [91], Xiaoyu et al [92], and NCCTFER [93]. Furthermore, we considered additional works such as FST-MWOS [94] and Sunyoung et al for the FER2013+ dataset. We utilized accuracy as our primary evaluation metric, with the results showcased in Tables 9 and 10.…”
Section: B. Facial Expression Recognition Results on FER2013+ and RAF-…
Citation type: mentioning
confidence: 99%
“…The classifiers are random forest (RF), logistic regression (LR), support vector machine (SVM), CNN, LSTM, and Bi-LSTM. The reason why we chose these ML and DL models is that they presented significant performance in similar NLP and text mining tasks (Malik et al., 2023; Rehan, Malik & Jamjoom, 2023). The following comparable models are designed:…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…In [25], two Transformer network frameworks are designed to extract facial information and motion information from face images, and the features obtained from both are combined for classification. The study in [26] uses multiple Swin Transformers in parallel, obtaining different weights by modifying the hyperparameters so as to capture different facial information and better discriminate facial expressions. In [27], the same image is divided into patches of two different sizes, each of which is fed into a separate Transformer network to extract features, and the information at the two scales is fused using cross-attention to achieve a competitive result.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
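As a companion to the architectures summarized in this excerpt, here is a minimal, hypothetical sketch of the two-scale idea attributed to [27]: the same face image is embedded with two different patch sizes, and a cross-attention layer fuses the resulting token sequences before classification. The embedding dimension, patch sizes, head count, and class count are illustrative assumptions, not the cited paper's implementation.

```python
# Minimal sketch (assumptions, not the cited paper's code): two branches embed
# the same image with different patch sizes; cross-attention lets the coarse
# tokens attend to the fine tokens before a pooled classification head.
import torch
import torch.nn as nn

class TwoScaleCrossAttentionFER(nn.Module):
    def __init__(self, dim=256, num_classes=7):
        super().__init__()
        # Patch embeddings at two scales (e.g., 16x16 and 32x32 patches).
        self.embed_small = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        self.embed_large = nn.Conv2d(3, dim, kernel_size=32, stride=32)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        # Flatten each scale's feature map into a token sequence.
        small = self.embed_small(x).flatten(2).transpose(1, 2)  # (B, Ns, dim)
        large = self.embed_large(x).flatten(2).transpose(1, 2)  # (B, Nl, dim)
        # Coarse tokens query fine tokens, fusing information across scales.
        fused, _ = self.cross_attn(query=large, key=small, value=small)
        return self.head(fused.mean(dim=1))  # pooled logits per expression

# Usage: logits = TwoScaleCrossAttentionFER()(torch.randn(2, 3, 224, 224))
```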