2021
DOI: 10.1109/access.2021.3057373

SCEP—A New Image Dimensional Emotion Recognition Model Based on Spatial and Channel-Wise Attention Mechanisms

Abstract: Images are an important carrier for emotional expression. Humans can understand the emotions in an image easily and quickly, whereas extracting accurate emotions is a very challenging task for machines. In this study, we propose a novel spatial and channel-wise attention-based emotion prediction model, SCEP, to assist computers in recognizing the emotions of images more accurately. SCEP integrates both spatial attention and channel-wise weight mechanisms into a classical convolutional neural network (CNN) layer str…

Citations: cited by 15 publications (8 citation statements)
References: 44 publications (57 reference statements)

“…However, these methods concentrate only on building deep networks to increase the models' representation capability, which results in high computation and memory demands. Besides, in most cases, the conventional spatial attention mechanism [45] only provides one-direction weight allocation [12]-[14], which results in the loss of vital information up to a specific level.…”
Section: A Adjacent Attention Block (mentioning)
confidence: 99%
“…Samara et al. [18] proposed a hierarchical machine learning method for facial expression-based affective state recognition, which employs a Euclidean distance-based feature representation, conjointly with a customized encoding for users' self-reported affective states. Ren et al. [19] proposed a novel spatial and channel-wise attention-based emotion prediction model, which integrates both spatial attention and channel-wise weight mechanisms into a CNN layer structure to predict image emotions, and finally outputs the emotion values in a continuous 2-D valence and arousal space.…”
Section: Traditional Facial Expression Recognition (mentioning)
confidence: 99%
“…Ren et al. [19] proposed a novel spatial and channel-wise attention-based emotion prediction model, which integrates both spatial attention and channel-wise weight mechanisms into a CNN layer structure to predict image emotions, and finally outputs the emotion values in a continuous 2-D valence and arousal space.…”
Section: Related Work (mentioning)
confidence: 99%
“…Zhao et al. explored the spatial connectivity patterns and interdependency between channels through spatialwise attention and channelwise attention [32]. Li et al. employed spatial attention to enhance the contrast between salient and irrelevant regions and adopted channel attention to emphasize informative features [33]. Ding et al. proposed pyramid spatial attention and pyramid channel attention to locate discriminative regions [34].…”
Section: Related Work (mentioning)
confidence: 99%
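
Several of the statements above refer to combining channel-wise weighting with spatial attention on a CNN feature map. The following is a minimal sketch of that general idea only, not the SCEP architecture from the paper: it assumes a feature map stored as nested lists (channels x height x width), uses no learned parameters, and derives both kinds of weights from simple averages passed through a sigmoid. All function names are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by the sigmoid of its global average.

    Assumption: real models learn this weight (e.g. with a small MLP);
    here the pooled average is used directly for illustration.
    """
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))
        out.append([[v * weight for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each spatial location by the sigmoid of its mean across channels."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    mask = [
        [sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * mask[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_ch)
    ]


def attend(fmap):
    """Apply channel-wise weighting first, then spatial attention."""
    return spatial_attention(channel_attention(fmap))


# Toy feature map: 2 channels, each 2 x 2.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
out = attend(fmap)
```

The output keeps the input shape; every value is multiplied by two weights in (0, 1), one shared by its channel and one shared by its spatial location. The order of the two steps is a design choice made for this sketch, not something the source specifies.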