2021
DOI: 10.3390/s21051579

Multi-Path and Group-Loss-Based Network for Speech Emotion Recognition in Multi-Domain Datasets

Abstract: Speech emotion recognition (SER) is a natural method of recognizing individual emotions in everyday life. To distribute SER models to real-world applications, some key challenges must be overcome, such as the lack of datasets tagged with emotion labels and the weak generalization of the SER model for an unseen target domain. This study proposes a multi-path and group-loss-based network (MPGLN) for SER to support multi-domain adaptation. The proposed model includes a bidirectional long short-term memory-based t…

Cited by 15 publications (3 citation statements)
References 46 publications (111 reference statements)
“…Despite the enormous success contributions in emotion recognition in English datasets, there is still gab in Arabic dataset and emotion recognition systems utilizes these Arabic datasets. Some Arabic speeches emotion datasets have been proposed in the literature, see [1]-[3], [5], [19]. Each dataset has a different set of classes or labels, for example, the Arabic audio acted dataset proposed in [20] has five labels (Happiness, Sadness, Neutral, Anger, Fear), and the dataset proposed in [15] has three classes (Happy, Surprised, and Angry), while the dataset proposed in [19] has labels (Happy, Sad, Neutral, Angry, Surprise, Disgust).…”
Section: Arabic Baved Dataset
mentioning
confidence: 99%
“…Despite the enormous success contributions in emotion recognition in English datasets, there is still a gab in Arabic dataset and emotion recognition systems utilizes these Arabic datasets. Various Arabic speeches emotion datasets have been proposed in the literature, whether audio or visual, see [1]-[4].…”
Section: Introduction Researchers and Scientists Have Used Deep Learn...
mentioning
confidence: 99%
“…This issue also includes a speech-emotion-recognition study [6] that proposed a multi-path and group-loss-based network (MPGLN) for emotion recognition to support multi-domain adaptation. The authors proposed a model that includes a bidirectional long short-term memory-based temporal feature generator and a transferred feature extractor from the pre-trained VGG-like audio classification model (VGGish).…”
mentioning
confidence: 99%