2017
DOI: 10.1007/978-3-319-66429-3_44

Improving Speech-Based Emotion Recognition by Using Psychoacoustic Modeling and Analysis-by-Synthesis

citations
Cited by 7 publications
(4 citation statements)
references
References 17 publications
“…We suppose that this outcome was caused by the underlying codec. It has been already shown for the hybrid operation mode of OPUS that certain emotions can be recognized better (Siegert et al, 2017), which is in some sense comparable to gender differences in the pronunciation of charisma. Furthermore, for men, only the differences between SPEEX and all other codecs are significant (according to Wilcoxon-Wilcox post-hoc tests with Bonferroni correction of alpha-error levels), whereas for women, also the comparisons of WAV vs. MP3 and WAV vs. OPUS came out significantly.…”
Section: Evaluation and Results (mentioning)
confidence: 77%
“…The workflow followed from working with the raw data until obtaining the trained model is represented in Figure 1. [23], which are the most important features for audio management [24]. We have decided to use 22.5 kHz, a bit-depth of 16 bits and only one audio channel corresponding to monaural sound reproduction.…”
Section: Methods (mentioning)
confidence: 99%
“…Although a number of studies investigated the general impact of codec compression on spectral quality and acoustic features (Byrne and Foulkes, 2004; Guillemin and Watson, 2009; Siegert et al, 2016), the effects on the preservation of emotions, nonverbally conveyed ones in particular, have rarely been addressed (Albahri et al, 2016; Jokisch et al, 2016). Especially, the preservation of nonverbal emotional cues under low bandwidths is underresearched.…”
Section: Codecs and Their Influence On Prosody (mentioning)
confidence: 99%
“…The main purpose of applying speech compression for mobile communication is to reduce the bandwidth for transmission, the transmission delay as well as the required system memory and storage (Maruschke et al, 2016; Siegert et al, 2016). Several codecs have been developed to meet various applications with different quality requirements, aiming to retain the speech intelligibility (ITU-T, 1996, 2014; Maruschke et al, 2016).…”
Section: Utilized Audio Codecs (mentioning)
confidence: 99%