2021
DOI: 10.1016/j.future.2021.05.029
Semi-supervised classification-aware cross-modal deep adversarial data augmentation

Cited by 10 publications
(7 citation statements)
References 32 publications
“…To conclude, although other works in the literature also perform multimodal emotion recognition on RAVDESS, such as Wang et al. [66], who used facial images to generate spectrograms that were then used as data augmentation to improve SER model performance on six emotions, to our knowledge our work is the first that evaluates a late fusion strategy using the visual information of RAVDESS for facial emotion recognition with the eight emotions of the dataset, a pre-trained STN, and the aural modality.…”
Section: Multimodal Emotion Recognitionmentioning
confidence: 96%
“…The Yolo v4 algorithm extracts image features through the CSPDarknet53 network, divides the image into an S × S grid, and detects a target via the grid cell containing its center; it up- and down-samples features with the residual network, stacks max-pooling outputs at different scales, and finally outputs the target category and position [28].…”
Section: Yolo V4mentioning
confidence: 99%
“…To sum up, despite the fact that other works in the literature also performed multimodal emotion recognition on RAVDESS, such as Wang et al. [ 33 ], who used facial images to generate spectrograms that were then used for data augmentation to improve SER model performance on six emotions, our work is the first that, to our knowledge, detects the stressed and relaxed state using the audio-visual information of RAVDESS by means of aural and facial emotion recognition using the eight emotions.…”
Section: Literature Reviewmentioning
confidence: 99%