2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017
DOI: 10.1109/icassp.2017.7953132

Effective emotion recognition in movie audio tracks

Abstract: This paper addresses the problem of speech emotion recognition from movie audio tracks. The recently collected Acted Facial Expression in the Wild 5.0 database is used. The aim is to discriminate among angry, happy, and neutral. We extract a relatively small number of features, a subset of which is not commonly used for the emotion recognition task. Those features are fed as input to an ensemble classifier that combines random forests with support vector machines. An accuracy of 65.63% is reported, outperformi…

Cited by 10 publications (5 citation statements)
References 26 publications
“…Audio-based emotion recognition exploits features such as pitch, intensity, energy, and MFCCs (Mel-frequency cepstrum coefficients). Challenges in speech-based emotion recognition are that expression differs among subjects [9] and is directly influenced by age, culture, and external factors such as the environment [10]. Video-based systems extract emotions from features such as facial expression, mouth, or eye shape [4].…”
Section: Related Work (mentioning)
confidence: 99%
“…the frame size is size change 1ms effect noticeable changes in speech emotion recognition. [13] Focus on the emotion to differ between the angry, happy, and neutral. We extract the feature and subset which is not commonly used in the emotion recognition task.…”
Section: Audio Analysis And (mentioning)
confidence: 99%
“…The random forest has predicted the correct emotion and gives the highest accuracy of 81.5%. Margarita Kotti [13] in 2017 The author introduces a method to recognize the emotion from movies and drama clips. Focus on the emotion to differ between the angry, happy, and neutral. We extract the feature and subset which is not mostly used in the emotion recognition task.…”
Section: Audio Analysis And (mentioning)
confidence: 99%