2024
DOI: 10.3390/app14125227

Speech Emotion Recognition under Noisy Environments with SNR Down to −6 dB Using Multi-Decoder Wave-U-Net

Hyun-Joon Nam,
Hong-June Park

Abstract: A speech emotion recognition (SER) model for noisy environments is proposed that uses four band-pass-filtered speech waveforms as the model input instead of simplified input features such as MFCC (Mel-frequency cepstral coefficients). The four waveforms retain all the information in the original noisy speech, whereas the simplified features keep only part of it. This reduction of information at the model input may degrade accuracy in noisy environments. A normalize…
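To illustrate the claim that a set of band-limited waveforms can retain all the information in the input, here is a minimal sketch of a complementary four-band split whose bands sum back to the original signal. It is not the authors' filter design: the one-pole filters, the band edges in `edges_hz`, and the function names are all hypothetical choices made for illustration, and the lowest and highest bands here are low-pass and high-pass rather than strictly band-pass.

```python
import math


def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """First-order IIR low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, prev = [], 0.0
    for sample in x:
        prev += a * (sample - prev)
        y.append(prev)
    return y


def split_into_four_bands(x, sample_rate, edges_hz=(500.0, 1500.0, 3500.0)):
    """Split x into four waveforms whose sample-wise sum equals x.

    edges_hz holds assumed band edges; the source does not state the
    edges used in the paper.
    """
    lows = [one_pole_lowpass(x, f, sample_rate) for f in edges_hz]
    band1 = lows[0]                                      # below the first edge
    band2 = [b - a for a, b in zip(lows[0], lows[1])]    # first to second edge
    band3 = [b - a for a, b in zip(lows[1], lows[2])]    # second to third edge
    band4 = [s - a for s, a in zip(x, lows[2])]          # above the third edge
    return [band1, band2, band3, band4]


# Synthetic test signal: two tones sampled at 16 kHz (assumed rate).
sample_rate = 16000
x = [
    math.sin(2 * math.pi * 300 * n / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 3000 * n / sample_rate)
    for n in range(1600)
]

bands = split_into_four_bands(x, sample_rate)

# Summing the four bands recovers the input up to floating-point rounding.
reconstructed = [sum(values) for values in zip(*bands)]
max_error = max(abs(r - s) for r, s in zip(reconstructed, x))
```

Because each band is defined as the difference between neighbouring low-pass outputs, the terms cancel when summed, so nothing is discarded at the model input. A feature such as MFCC, by contrast, cannot in general be inverted back to the exact waveform.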

citations
Cited by 0 publications
references
References 35 publications