Facial expression recognition plays a key role in human-computer emotional interaction. However, human faces in real environments are affected by various unfavorable factors, which reduce expression recognition accuracy. In this paper, we propose a novel method that combines Fine-tuning Swin Transformer and Multiple Weights Optimality-seeking (FST-MWOS) to enhance expression recognition performance. FST-MWOS consists of two main components: Fine-tuning Swin Transformer (FST) and Multiple Weights Optimality-seeking (MWOS). FST takes Swin Transformer Large as the backbone network and obtains multiple groups of fine-tuned model weights for homologous data domains through different hyperparameter configurations, data augmentation methods, etc. In MWOS, a greedy strategy is used to mine locally optimal generalization within the optimal epoch interval of each group of fine-tuned model weights. An optimality-seeking step over the multiple groups of locally optimal weights is then used to obtain the globally optimal solution. Experimental results on the RAF-DB, FERPlus and AffectNet datasets show that the proposed FST-MWOS method outperforms various state-of-the-art methods.
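The two-stage search described in this abstract can be illustrated with a small sketch. The abstract does not say how candidate weights are combined or scored, so everything here is an assumption made for illustration: weights are combined by uniform averaging (in the style of greedy weight averaging), a candidate is kept only if a held-out score does not drop, and the names `average`, `greedy_search`, `two_stage_search` and `score` are hypothetical rather than the authors' implementation.

```python
from typing import Callable, Dict, List

# Assumption: a set of model weights is a dict of parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def average(weights: List[Weights]) -> Weights:
    """Element-wise mean of weight dicts that share the same keys and lengths."""
    n = len(weights)
    return {
        name: [sum(vals) / n for vals in zip(*(w[name] for w in weights))]
        for name in weights[0]
    }


def greedy_search(candidates: List[Weights],
                  score: Callable[[Weights], float]) -> Weights:
    """Greedily grow an averaged set, keeping a candidate only if the
    held-out score of the running average does not decrease."""
    ranked = sorted(candidates, key=score, reverse=True)
    kept = [ranked[0]]
    best = score(ranked[0])
    for cand in ranked[1:]:
        trial = score(average(kept + [cand]))
        if trial >= best:
            kept.append(cand)
            best = trial
    return average(kept)


def two_stage_search(groups: List[List[Weights]],
                     score: Callable[[Weights], float]) -> Weights:
    """Stage 1: a local search inside each fine-tuning run (its checkpoints).
    Stage 2: the same search across the per-run results."""
    local_optima = [greedy_search(group, score) for group in groups]
    return greedy_search(local_optima, score)


if __name__ == "__main__":
    # Toy stand-in for validation accuracy: best when the single weight equals 1.0.
    def toy_score(w: Weights) -> float:
        return -(w["w"][0] - 1.0) ** 2

    runs = [[{"w": [0.8]}, {"w": [1.2]}], [{"w": [3.0]}]]
    print(two_stage_search(runs, toy_score))
```

In practice `score` would evaluate a model on a validation split, and each group would hold the checkpoints from the chosen epoch interval of one fine-tuning run.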
Objective. Convolutional neural networks (CNNs) have been widely adopted for medical image segmentation because of their outstanding feature representation capabilities. As segmentation accuracy keeps improving, network complexity increases as well. Complex networks can achieve better performance but require more parameters and are hard to train with limited resources, while lightweight models are faster but cannot fully utilize the contextual information of medical images. In this paper, we focus on better balancing efficiency and accuracy. Approach. We propose a correlation-enhanced lightweight network (CeLNet) for medical image segmentation, which adopts a siamese structure for weight sharing and parameter saving. Through feature reuse and feature stacking in parallel branches, a point-depth convolution parallel block (PDP Block) is proposed to reduce the model parameters and computational cost while improving the feature extraction capability of the encoder. A relation module is also designed to extract feature correlations between input slices: it uses global and local attention to enhance feature connections, reduces feature differences through element-wise subtraction, and finally obtains contextual information from associated slices to improve segmentation performance. Main results. We conduct extensive experiments on the LiTS2017, MM-WHS and ISIC2018 datasets. The proposed model uses only 5.18M parameters but achieves excellent segmentation performance: a DSC of 0.9233 on the LiTS2017 dataset, an average DSC of 0.7895 on the MM-WHS dataset and an average DSC of 0.8401 on the ISIC2018 dataset. Significance. CeLNet achieves state-of-the-art performance on multiple datasets while remaining lightweight.
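The results above are reported as DSC, the Dice similarity coefficient, defined for a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). A minimal sketch of that computation on flattened binary masks follows. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary masks.

    Any non-zero entry counts as foreground.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # |A ∩ B|: positions that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # |A| + |B|: foreground counts of each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(prediction, reference))
```

An "average DSC", as quoted for MM-WHS and ISIC2018, would typically be the mean of this value over structures or test cases, although the abstract does not state which.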