Abstract: This paper presents the design and development of a new multilingual emotional speech corpus, TaMaR-EmoDB (Tamil Malayalam Ravula-Emotion DataBase), and its evaluation using a deep neural network (DNN) baseline system. The corpus consists of utterances in three languages: Malayalam, Tamil, and Ravula, a tribal language. The database consists of short speech utterances in four emotions (anger, anxiety, happiness, and sadness), along with neutral utterances. The subset of the corpus is first evaluated usi…
“…From Figure 2 and Table 4 we can see that experts recognize all emotions noticeably above chance (0.25 for 4-class classification). This is consistent with the results of perception tests for adult emotion speech. For example, Rajan et al. [13] reported comparable…”
Section: Results Of the Subjective Evaluation Of Emotional Speech Rec... (mentioning)
confidence: 77%
“…For example, Sowmya and Rajeswari [75] reported that they achieved an overall accuracy of 0.85 for automatic children's speech emotion recognition in the Tamil language with an SVM classifier on prosodic (energy) and spectral (MFCC) features. Rajan et al. [13] reported that they achieved an Average Recall of 0.61 and Average Precision of 0.60 in the Tamil language using a DNN framework, also on prosodic and spectral features.…”
Section: Results Of Automatic Emotion Recognition On Extended Feature... (mentioning)
confidence: 99%
“…This is consistent with the results of perception tests for adult emotion speech. For example, Rajan et al. [13] reported comparable results of subjective evaluation using a perception test with an overall accuracy of 0.846 in recognizing emotional states (happiness, anger, sadness, anxiety, and neutral) from acted emotional speech in the Tamil language.…”
Section: Results Of the Subjective Evaluation Of Emotional Speech Rec... (mentioning)
confidence: 99%
“…Humans are able to identify emotional states in Russian children's emotional speech above chance. Following [13], this ensures the emotional quality and naturalness of the emotions in the collected corpus of Russian children's emotional speech. 2.…”
Section: Comparison Of the Subjective Evaluation And Automatic Emotio... (mentioning)
confidence: 99%
“…Another severe problem is the paucity of publicly available transcribed linguistic resources for children's speech. Moreover, research on SER faces the problem that most available speech corpora differ from each other in important ways, such as methods of annotation and scenarios of interaction [11][12][13]. Inconsistencies in these methods and scenarios make it difficult to build SER systems.…”
This paper introduces an extended description of a database that contains Russian-language emotional speech of younger school-age children (8–12 years old) and describes the results of validating the database with classical machine learning algorithms, namely Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP). The validation follows standard procedures and scenarios similar to those used for other well-known databases of children's acted emotional speech. Performance evaluation of automatic multiclass recognition on four emotion classes (Neutral (Calm), Joy, Sadness, and Anger) shows that both SVM and MLP outperform the results of perceptual tests. Moreover, the results of automatic recognition on the test dataset that was used in the perceptual test are even better. These results show that emotions in the database can be reliably recognized both by experts and automatically using classical machine learning algorithms such as SVM and MLP, which can serve as baselines for comparing emotion recognition systems based on more sophisticated modern machine learning methods and deep neural networks. The results also confirm that this database can be a valuable resource for researchers studying affective reactions in speech communication during child-computer interactions in the Russian language, and that it can be used to develop applications in areas such as edutainment and health care.
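The citation statements above compare results using a chance level (0.25 for 4-class classification), Average Recall, and Average Precision. As a minimal sketch of how such figures are commonly computed, the Python below assumes that the averages are unweighted (macro) means over classes; the cited papers may define them differently. The function names and the confusion matrix counts are hypothetical and are not taken from any of the papers.

```python
def chance_level(num_classes):
    """Expected accuracy of uniform random guessing over num_classes classes."""
    return 1.0 / num_classes


def macro_recall_precision(confusion):
    """Return (average recall, average precision), unweighted over classes.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    recalls, precisions = [], []
    for c in range(k):
        true_positive = confusion[c][c]
        true_total = sum(confusion[c])                        # row sum
        pred_total = sum(confusion[r][c] for r in range(k))   # column sum
        recalls.append(true_positive / true_total if true_total else 0.0)
        precisions.append(true_positive / pred_total if pred_total else 0.0)
    return sum(recalls) / k, sum(precisions) / k


# Hypothetical 4-class confusion matrix (made-up counts for illustration only).
toy_confusion = [
    [8, 1, 1, 0],
    [2, 6, 1, 1],
    [1, 1, 7, 1],
    [0, 2, 1, 7],
]

avg_recall, avg_precision = macro_recall_precision(toy_confusion)
print(chance_level(4))  # 0.25, the chance level quoted for 4-class classification
print(round(avg_recall, 3), round(avg_precision, 3))
```

A classifier or a group of human listeners is said to perform above chance when its accuracy or average recall exceeds `chance_level(num_classes)`.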