2015 20th Symposium on Signal Processing, Images and Computer Vision (STSIVA) 2015
DOI: 10.1109/stsiva.2015.7330399
Cited by 9 publications (5 citation statements)
References 10 publications
“…Prior work in this area has focused on examining the effect of mixed bandwidths on automatic speech recognition (Mac et al, 2019) as well as evaluating the performance of different audio codecs on emotion recognition (Garcia et al, 2015;Siegert et al, 2016). Evaluation of the audio quality of codecs is naturally also a classic task, e.g.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…There is also prior work in ASR that examines mixed-bandwidth models that deal with data with different sampling rates [6]. There has been some related work in the emotion recognition domain; the authors in [7] and [8] show how different codecs and different bitrates affect emotion recognition accuracy. The authors in [9] analyze human emotion intelligibility as a function of the bitrates.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The expression of emotion via speech signal can be regarded as the most natural, fast and efficient communication means to tell the other party of what is inside of one's heart. There has been a large body of research in emotion recognition using physiological signals [2,3], facial images and videos [4,5] including human speech [6][7][8][9] to correlate with emotions for various applications such as in security system, classroom pedagogy, customer service call centres and job matching marketplace through phone interviews.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…One of the simple selection criteria is to classify the speech frames as voiced (V) or unvoiced (UV). The performance of ERS on compressed speech was investigated in [9] using seven digital telecommunication codecs based on adaptive differential pulse code modulation (ADPCM) and Analysis-by-Synthesis. V and UV segments of speech data was classified using autocorrelation method in Praat software tool.…”
Section: Introduction
Citation type: mentioning
confidence: 99%