2017
DOI: 10.3390/computation5020026

Deep Visual Attributes vs. Hand-Crafted Audio Features on Multidomain Speech Emotion Recognition

Abstract: Emotion recognition from speech may play a crucial role in many applications related to human-computer interaction or understanding the affective state of users in certain tasks, where other modalities such as video or physiological parameters are unavailable. In general, a human's emotions may be recognized using several modalities, such as facial expressions, speech, and physiological parameters (e.g., electroencephalograms, electrocardiograms). However, measuring these modalities may be difficul…

Citations: cited by 45 publications (29 citation statements)
References: 38 publications
“…Investigating the content of past customer-firm interactions rather than simply focusing on the current interaction could capture changes in the relationship and help identify the customers who are most likely to churn. Deep learning methods have also been developed to analyze emotions based on audio content (e.g., Papakostas et al 2017), which can be used to assess a customer's mindset across a sequence of interactions so as to spot changes in how customers communicate with the firm that may signal changes in the underlying relationship.…”
Section: Challenges To Leveraging Unstructured Customer-firm Interact…
Citation type: mentioning
confidence: 99%
“…Our study also used raw spectrogram as the input data to our proposed model. We therefore extracted spectrogram from all the emotional audio signal in each corpora by cropping segment of 2 s length randomly using the short-time Fourier transforms at a window size short term of 40 ms and steps of 20 ms proposed by Reference [65].…”
Section: Emotion Corpora
Citation type: mentioning
confidence: 99%
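
The spectrogram extraction described in the statement above (a random 2 s crop, then a short-time Fourier transform with a 40 ms window and 20 ms step) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' code: the function name `random_crop_spectrogram`, the Hamming window, the magnitude (rather than power or log) scaling, the absence of padding, and the 16 kHz sample rate in the usage note are all assumptions that the quoted text does not specify.

```python
import numpy as np


def random_crop_spectrogram(signal, sample_rate, crop_s=2.0,
                            win_s=0.040, hop_s=0.020, rng=None):
    """Magnitude spectrogram of a random fixed-length crop (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng

    # Convert durations in seconds to lengths in samples.
    crop_len = int(round(crop_s * sample_rate))
    win_len = int(round(win_s * sample_rate))
    hop_len = int(round(hop_s * sample_rate))

    signal = np.asarray(signal, dtype=float)
    if len(signal) < crop_len:
        raise ValueError("signal is shorter than the requested crop")

    # Pick a random start so that the crop fits entirely inside the signal.
    start = int(rng.integers(0, len(signal) - crop_len + 1))
    segment = signal[start:start + crop_len]

    # Slice the crop into overlapping frames (no padding: an assumption).
    n_frames = 1 + (crop_len - win_len) // hop_len
    window = np.hamming(win_len)  # window type is an assumption
    frames = np.stack([
        segment[i * hop_len:i * hop_len + win_len] * window
        for i in range(n_frames)
    ])

    # One real FFT per frame; return shape (frequency bins, time frames).
    return np.abs(np.fft.rfft(frames, axis=1)).T
```

With a 16 kHz signal, the 40 ms window is 640 samples and the 20 ms step is 320 samples, so a 2 s crop gives 99 frames of 321 frequency bins each. Passing a seeded `rng` makes the random crop reproducible.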
“…[9,10], or in other cases [14][15][16] follow the dimensional approach [17] that originates from psychophysiology. Spectrograms have also been previously used for other audio analysis-related tasks, such as content classification [18] and segmentation [19], for stress recognition [20] and more recently for emotion recognition with convolutional neural networks [21,22].…”
Section: Non-linguistic Approaches For Emotion Recognition
Citation type: mentioning
confidence: 99%