Abstract: SUBESCO is an audio-only emotional speech corpus for the Bangla language. The corpus contains 7000 utterances with a total duration of more than 7 hours, making it the largest emotional speech corpus available for this language. Twenty native speakers participated in the gender-balanced set, each recording 10 sentences simulating seven targeted emotions. Fifty university students participated in the evaluation of this corpus. Each audio clip of this corpus, except those of Disgust emotion, was validated fo…
“…To a degree, the neutral emotional class is mistaken for the sad class. Note that, similar confusion has been seen in prior studies too [2, 11]. Recognition rates are highest for angry followed by neutral, happy, sad, and surprise emotions.…”
Section: Experimental Design Materials and Methods (supporting)
confidence: 82%
“…There are several SER datasets available in literature covering English, Greek, Korean, and German languages, such as IEMOCAP [5], RAVDESS [6], MSP-IMPROV [2], AESDD [7], SAVEE [8], CADKES [9], and EMO-DB [10]. However, there is only one dataset for the SER task in the Bangla language, namely SUBESCO [11]. A simple comparison of these public SER datasets with BanglaSER is shown in Table 1.…”
“…Most recently, another database, whose verification is based on perception tests and statistical analyses, came from the Bangla language. The SUST Bangla Emotional Speech Corpus (SUBESCO) [16] involves 20 actors portraying 7 emotions.…”
Section: Existing Databases (mentioning)
confidence: 99%
“…However, the relationship was not significant (p = .16). The identification accuracy-naturalness relationship is captured in Fig.…”
Section: Identification Accuracy and Naturalness Relationship (mentioning)
confidence: 99%
“…In the study of speech emotion recognition, validated emotional speech databases constitute a crucial building block for developing and evaluating speech emotion recognizers [13]. To date, numerous emotional speech databases have been created in many languages, including Arabic [14], [15], Bangla [16], Mandarin Chinese [17], [18], Danish [19], English [20]-[22], German [23], [24], Italian [25], and Persian [26]. However, the majority of the databases come from high-resource languages such as English, Mandarin Chinese, and German [27].…”
A growing body of evidence indicates that intensity plays a role in emotion perception. However, only a few databases have been explicitly designed to provide emotional stimuli that are expressed at varying intensities. We developed and validated a Korean audio-only database of emotional expressions. Eighteen actors were recorded using twenty-five sentences with strong and moderate intensities for "neutral," "happiness," "sadness," "anger," "fear," and "boredom" emotions. Twenty-five native Korean-speaking adults completed the emotion identification and naturalness rating tasks. All listeners were presented with the full set of 5400 recordings in a six-alternative forced-choice paradigm, yielding 135000 judgements for identification and naturalness, respectively. Raw and unbiased hit rates were calculated, with identification responses significantly above chance level for every emotion at both intensities. The overall raw hit rates reached 87% and 78% for the strong and moderate stimuli, respectively, indicating that strong emotional expressions were more accurately identified than their moderate counterparts. Similarly, a recognition advantage for strong intensity over moderate intensity was observed for each emotion at both intensities. High inter- and intra-rater reliabilities were found in listeners' identifying emotion categories and assigning naturalness ratings, respectively. Further, there was a strong association between identification accuracy and the degree of naturalness; more natural variants of an emotion were more accurately identified than its less natural counterparts. These results confirm that the proposed database will serve as a valuable source for emotion research.
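The abstract above reports raw and unbiased hit rates without giving the formulas. As a minimal sketch, assuming the unbiased hit rate follows the common definition by Wagner (1993), that is, squared hits divided by the product of the stimulus total and the response total, the two measures could be computed from a confusion matrix as shown here. The function name `hit_rates` and the toy matrix are hypothetical and are not taken from the paper.

```python
def hit_rates(confusion):
    """Return (raw, unbiased) hit rates per category.

    confusion[i][j] is the number of times a stimulus of category i
    was labelled as category j (square matrix, hypothetical layout).
    """
    n = len(confusion)
    # Row sums: how often each category was presented.
    stim_totals = [sum(row) for row in confusion]
    # Column sums: how often each category was chosen as a response.
    resp_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    raw, unbiased = [], []
    for k in range(n):
        hits = confusion[k][k]
        # Raw hit rate: proportion of presentations identified correctly.
        raw.append(hits / stim_totals[k] if stim_totals[k] else 0.0)
        # Assumed unbiased hit rate (Wagner's Hu): hits^2 / (row total * column total).
        denom = stim_totals[k] * resp_totals[k]
        unbiased.append(hits * hits / denom if denom else 0.0)
    return raw, unbiased


# Toy two-category example (made-up counts, not data from the study).
toy = [[8, 2],
       [4, 6]]
raw, unbiased = hit_rates(toy)
```

In this toy case the first category is chosen more often than it is presented, so its unbiased rate drops further below its raw rate than the second category's does. That is the response bias the measure is meant to correct for.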