2018
DOI: 10.17743/jaes.2018.0036

Speech Emotion Recognition for Performance Interaction

Cited by 54 publications (21 citation statements)
References 0 publications
“…We have used a cross-corpus, vocabulary-independent and language-independent evaluation strategy. The unknown speakers have been selected from the AESDD dataset described in [55]. The unsupervised diarization results are shown in Table 2.…”
Section: Results
confidence: 99%
“…Such analytics, in correlation with the delivered content, provide insight for future planning. The baseline metadata scheme can also be extended to involve speech [55,56] and music [57,58] emotional cues. As it is depicted in Figure 4, the functionality that concerns different groups of interest is unified in a common framework.…”
Section: 2a Web Application For Live Radio Production and Annotation
confidence: 99%
“…Speech Emotion Recognition (SER) consists of the identification of the emotional content of speech signals, the task of recognizing human emotions and affective states from speech. In the SER field, there are three important aspects being studied and discussed in the literature: the choice of suitable acoustic features [9], the design of an appropriate classifier [10] and the generation of an emotional speech database [11][12][13]. Some works propose multimodal approaches combining visual and speech data to improve and strengthen emotion recognition systems [14,15].…”
Section: Technological Challenges
confidence: 99%
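The excerpt above frames SER as a pipeline of acoustic feature extraction followed by a classifier. A minimal sketch of that pipeline, using synthetic stand-in signals rather than an emotional speech corpus, hand-picked energy/zero-crossing features, and a nearest-centroid rule (all of which are illustrative assumptions, not the cited papers' methods):

```python
import numpy as np

def features(signal, frame=512):
    """Utterance-level features: mean log frame energy and zero-crossing rate."""
    frames = signal[: len(signal) // frame * frame].reshape(-1, frame)
    log_energy = np.log(np.mean(frames ** 2, axis=1) + 1e-10).mean()
    zcr = np.mean(np.abs(np.diff(np.sign(signal)))) / 2
    return np.array([log_energy, zcr])

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000)
# Synthetic stand-ins for emotional classes: "high-arousal" = loud and noisy,
# "low-arousal" = soft pure tone (real SER would use a labeled speech corpus).
loud = 0.9 * np.sin(2 * np.pi * 220 * t) + 0.3 * rng.standard_normal(t.size)
soft = 0.1 * np.sin(2 * np.pi * 220 * t)

centroids = {"high-arousal": features(loud), "low-arousal": features(soft)}

def classify(signal):
    """Assign the class whose feature centroid is nearest."""
    f = features(signal)
    return min(centroids, key=lambda k: np.linalg.norm(f - centroids[k]))

test = 0.8 * np.sin(2 * np.pi * 220 * t) + 0.25 * rng.standard_normal(t.size)
print(classify(test))  # → high-arousal
```

Real systems replace these toy features with MFCC-style descriptors and the centroid rule with a trained classifier, but the structure — features in, class label out — matches the three aspects the excerpt lists.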
“…In this context, new audio recognition and semantic analysis techniques are deployed for General Audio Detection and Classification (GADC) tasks, which are very useful in many multidisciplinary domains [4][5][6][7][8][9][10][11][12][13][14][15][16]. Typical examples include speech recognition and perceptual enhancement [5][6][7][8], speaker indexing and diarization [14][15][16][17][18][19], voice/music detection and discrimination [1][2][3][4][9][10][11][12][13][20][21][22], information retrieval and genre classification of music [23,24], audio-driven alignment of multiple recordings [25,26], sound emotion recognition [27][28][29] and others [10,[30][31][32]. Concerning the media production and broadcasting domain, audio and audio-driven segmentation allow for the implementation of prope...…”
Section: Introduction
confidence: 99%