2021
DOI: 10.48550/arxiv.2105.04806
Preprint

Deep scattering network for speech emotion recognition

Abstract: This paper introduces the scattering transform for speech emotion recognition (SER). The scattering transform generates feature representations that remain stable to deformations and to shifts in time and frequency without much loss of information. In speech, emotion cues are spread across time and localised in frequency. The time and frequency invariance of scattering coefficients provides a representation that is robust against emotion-irrelevant variations, e.g., different speakers, languages, and genders. …
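
The shift stability described in the abstract can be illustrated with a minimal first-order scattering sketch: band-pass filtering, a modulus, then low-pass averaging. This is not the paper's implementation. The function names (`gabor_bank`, `first_order_scattering`), the Gaussian filter shapes, and all parameter values are assumptions chosen for illustration, and circular (FFT-based) convolution is used for simplicity.

```python
import numpy as np


def gabor_bank(n, num_filters=8):
    """Hypothetical band-pass bank: Gaussian bumps on positive frequencies.

    Center frequencies are spaced geometrically (one per octave) and the
    bandwidth is proportional to the center frequency, as in a wavelet bank.
    """
    freqs = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    centers = 0.4 * 2.0 ** (-np.arange(num_filters))
    bank = [np.exp(-0.5 * ((freqs - xi) / (xi / 4.0)) ** 2) for xi in centers]
    return np.stack(bank)  # shape: (num_filters, n)


def first_order_scattering(x, bank, avg_sigma=0.005):
    """First-order coefficients: low-pass average of |x * psi| per filter.

    avg_sigma (an assumed value) is the width of the Gaussian low-pass in
    cycles per sample; a smaller value averages over a longer time window.
    """
    n = x.shape[-1]
    x_hat = np.fft.fft(x)
    # Band-pass filtering in the frequency domain, then the modulus.
    u = np.abs(np.fft.ifft(x_hat[None, :] * bank, axis=-1))
    # Gaussian low-pass applied to each modulus envelope.
    phi_hat = np.exp(-0.5 * (np.fft.fftfreq(n) / avg_sigma) ** 2)
    s = np.fft.ifft(np.fft.fft(u, axis=-1) * phi_hat[None, :], axis=-1)
    return s.real  # shape: (num_filters, n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1024)  # stand-in for a speech frame
    bank = gabor_bank(x.size)
    s = first_order_scattering(x, bank)
    s_shifted = first_order_scattering(np.roll(x, 37), bank)
    # Time-averaged coefficients are unchanged by a circular shift.
    print(np.abs(s.mean(axis=-1) - s_shifted.mean(axis=-1)).max())
```

In this sketch, a circular time shift of the input only shifts the coefficients, so their averages over time do not change. A full scattering network would cascade further wavelet-modulus layers and typically subsample the averaged output.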

Citations: cited by 1 publication (1 citation statement)
References: 20 publications (32 reference statements)
“…On the contrary, dimensional emotion can represent human emotion in a wider range than that of categorical emotion. In addition, distinguishing categorical emotion tends to cause confusion if arousal and valence levels are similar [6,7]. Therefore, in this study, we focus on the arousal and valence of dimensional emotion, and, in particular, on discrete arousal and valence tasks, since arousal and valence recognition can be designed as a regression task [8][9][10] or as a categorical task [11][12][13].…”
Section: Introduction (mentioning)
confidence: 99%