2018 13th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP)
DOI: 10.1109/smap.2018.8501881
Speech Emotion Recognition Adapted to Multimodal Semantic Repositories

Cited by 17 publications (16 citation statements) · References 11 publications
“…Finally, a question arises whether a ground truth database may be formulated containing emotionally "loaded" utterances, utilizing such techniques as, e.g., crowdsourcing [57] applied for both "producing" emotions in speech as well as evaluating gathered utterances. Such an experiment may result in more reliable datasets for the in-depth training process.…”
Section: Discussion
confidence: 99%
“…The current methods of analyzing audio data flow are not perfect either. Speech recognition technologies are at a high level, and the current results help us analyze the semantic part of speech [13]. However, the intonation components have not yet been covered properly.…”
Section: Methods
confidence: 99%
“…In the SER field, there are three important aspects being studied and discussed in the literature: the choice of suitable acoustic features [9], the design of an appropriate classifier [10] and the generation of an emotional speech database [11][12][13]. Some works propose multimodal approaches combining visual and speech data to improve and strengthen emotion recognition systems [14,15]. It is also well attested that speech recognition systems function less efficiently when the speaker is in an emotional state [16].…”
Section: Technological Challenges
confidence: 99%
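The last citation statement names the choice of suitable acoustic features as one of the central SER design questions. As an illustrative sketch only (not taken from the cited paper, and using hypothetical function names), two classic low-level descriptors used in many SER feature sets, per-frame log-energy and zero-crossing rate, can be computed with plain NumPy:

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def acoustic_features(x, frame_len=400, hop=160):
    """Return an (n_frames, 2) matrix of [log-energy, zero-crossing rate].

    These are simple low-level descriptors; real SER systems typically
    add MFCCs, pitch, and spectral statistics on top of such frames.
    """
    frames = frame_signal(x, frame_len, hop)
    # Log-energy per frame, with a small floor to avoid log(0)
    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    # Fraction of sample-to-sample sign changes per frame
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.stack([log_energy, zcr], axis=1)

# Usage: a 1-second 220 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220 * t)
feats = acoustic_features(x)
print(feats.shape)  # (98, 2)
```

Such frame-level features are usually aggregated over an utterance (mean, variance, percentiles) before being fed to the classifier that the citing works discuss.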