Ten Recent Trends in Computational Paralinguistics

Schuller, Björn; Weninger, Felix

doi:10.1007/978-3-642-34584-5_3

Cited by 10 publications

(6 citation statements)

References 90 publications


“…In our experiments, we used such evaluation metrics as the per class Accuracy, Precision, Recall, and F1-score. Due to the unequal number of samples in each test class (unequal priors), we have analyzed the results using Unweighted Average Recall (UAR) for multiclass classifiers, closely related to the accuracy as a good or even better metric to optimize when the sample class ratio is imbalanced [72]. UAR is defined as the average across the diagonal of the confusion matrix.…”

Section: Evaluation Setup (mentioning)

confidence: 99%
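The UAR described in the statement above can be illustrated with a short sketch. It assumes a confusion matrix whose rows are true classes and whose columns are predicted classes; under that assumption, averaging the diagonal only gives UAR once each row has been normalized by its class count. The function names and the example matrix are hypothetical and are not taken from the cited paper.

```python
def per_class_recall(confusion):
    """Recall of each class: diagonal count divided by the row total.

    Assumes rows are true classes and columns are predicted classes.
    Classes with no samples are skipped, since their recall is undefined.
    """
    recalls = []
    for k, row in enumerate(confusion):
        support = sum(row)  # number of samples whose true class is k
        if support > 0:
            recalls.append(row[k] / support)
    return recalls


def unweighted_average_recall(confusion):
    """UAR: the plain mean of per-class recalls, ignoring class priors."""
    recalls = per_class_recall(confusion)
    return sum(recalls) / len(recalls)


def accuracy(confusion):
    """Overall accuracy: correct predictions divided by all predictions."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical imbalanced two-class example: 100 samples vs. 10 samples.
example = [
    [90, 10],  # true class 0
    [5, 5],    # true class 1
]
uar = unweighted_average_recall(example)
acc = accuracy(example)
```

In this made-up example the majority class dominates accuracy, while UAR weights both classes equally, which is why it is often preferred when class priors are unequal.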


Automatic Speech Emotion Recognition of Younger School Age Children

Matveev, Matveev, Frolova et al. 2022. Mathematics

This paper introduces the extended description of a database that contains emotional speech in the Russian language of younger school age (8–12-year-old) children and describes the results of validation of the database based on classical machine learning algorithms, such as Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP). The validation is performed using standard procedures and scenarios of the validation similar to other well-known databases of children’s emotional acting speech. Performance evaluation of automatic multiclass recognition on four emotion classes “Neutral (Calm)—Joy—Sadness—Anger” shows the superiority of SVM performance and also MLP performance over the results of perceptual tests. Moreover, the results of automatic recognition on the test dataset which was used in the perceptual test are even better. These results prove that emotions in the database can be reliably recognized both by experts and automatically using classical machine learning algorithms such as SVM and MLP, which can be used as baselines for comparing emotion recognition systems based on more sophisticated modern machine learning methods and deep neural networks. The results also confirm that this database can be a valuable resource for researchers studying affective reactions in speech communication during child-computer interactions in the Russian language and can be used to develop various edutainment, health care, etc. applications.


“…The test set for automatic evaluation consists of 33 separate samples of acting emotional speech of Russian children used in perception tests [21]. Both classifiers were trained based on the eGeMAPS feature set [72].…”

Section: Comparison Of the Subjective Evaluation And Automatic Emotio... (mentioning)

confidence: 99%

Automatic Speech Emotion Recognition of Younger School Age Children

Matveev, Matveev, Frolova et al. 2022. Mathematics

“…In such cases the quantization into a few categorical labels might lead to a loss in model representativeness [7]. In comparison with the categorical problem, only a few publications have addressed the dimensional recognition challenges, yet it has become a trend in the affective computing community [7], [40], [46], [47], [48], [49]. Some works approximated dimensional affect indicators with fine-grained quantization scales on segmented data, as in [42].…”

Section: Related Work (mentioning)

confidence: 99%

Dimensional Affect Recognition from HRV: An Approach Based on Supervised SOM and ELM

Bugnon, Calvo, Milone 2020. IEEE Trans. Affective Comput.

Dimensional affect recognition is a challenging topic and current techniques do not yet provide the accuracy necessary for HCI applications. In this work we propose two new methods. The first is a novel self-organizing model that learns from similarity between features and affects. This method produces a graphical representation of the multidimensional data which may assist the expert analysis. The second method uses extreme learning machines, an emerging artificial neural network model. Aiming for minimum intrusiveness, we use only the heart rate variability, which can be recorded using a small set of sensors. The methods were validated with two datasets. The first is composed of 16 sessions with different participants and was used to evaluate the models in a classification task. The second one was the publicly available Remote Collaborative and Affective Interaction (RECOLA) dataset, which was used for dimensional affect estimation. The performance evaluation used the kappa score, unweighted average recall and the concordance correlation coefficient. The concordance coefficient on the RECOLA test partition was 0.421 in arousal and 0.321 in valence. Results show that our models outperform state-of-the-art models on the same data and provide new ways to analyze affective states.
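The concordance correlation coefficient named in the abstract above can be sketched as follows. This is a minimal illustration of the standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (1/N) statistics. The function name, the handling of the degenerate case, and the toy series are assumptions for illustration and are not taken from the cited paper.

```python
def concordance_correlation_coefficient(predictions, targets):
    """Concordance correlation coefficient between two equal-length series.

    Uses population (1/N) variance and covariance. Unlike Pearson
    correlation, the score is also lowered by a shift or scale mismatch
    between the two series.
    """
    n = len(predictions)
    if n == 0 or n != len(targets):
        raise ValueError("series must be non-empty and of equal length")

    mean_p = sum(predictions) / n
    mean_t = sum(targets) / n
    var_p = sum((p - mean_p) ** 2 for p in predictions) / n
    var_t = sum((t - mean_t) ** 2 for t in targets) / n
    cov = sum((p - mean_p) * (t - mean_t)
              for p, t in zip(predictions, targets)) / n

    denominator = var_p + var_t + (mean_p - mean_t) ** 2
    if denominator == 0:
        # Both series are constant and identical; treating this as perfect
        # agreement is a convention assumed here, not a universal rule.
        return 1.0
    return 2 * cov / denominator


# Toy arousal traces (made-up values): the second prediction has the right
# shape but a constant offset, which the coefficient penalizes.
gold = [1.0, 2.0, 3.0]
ccc_exact = concordance_correlation_coefficient([1.0, 2.0, 3.0], gold)
ccc_shifted = concordance_correlation_coefficient([2.0, 3.0, 4.0], gold)
```

The shifted trace would still have a Pearson correlation of 1 with the gold trace, so the lower concordance score illustrates why this measure is commonly used for continuous affect estimation.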


“…Sensing affect related states, including interest, confusion, or frustration, and adapting behavior accordingly, is one of the key capabilities of humans; consequently, simulating such abilities in technical systems through signal processing and machine learning techniques is believed to improve human-computer interaction in general (Schuller & Weninger, 2012) and computer based learning in particular (Aist, Kort, Reilly, Mostow, & Picard, 2002; Forbes-Riley & Litman, 2010). Important abilities of affective tutors or lecturers, besides emotional expressivity (Huang, Kuo, Chang, & Heh, 2004), include the choice of appropriate wording, which has been found to be highly important in computer based tutoring to support the learning outcome (Narciss & Huth, 2004).…”

Section: Introduction (mentioning)

confidence: 98%

Words that Fascinate the Listener

Weninger, Staudt, Schuller 2013. International Journal of Distance Education Technologies (self-citation)

In a large scale study on 843 transcripts of Technology, Entertainment and Design (TED) talks, the authors address the relation between word usage and categorical affective ratings of lectures by a large group of internet users. Users rated the lectures by assigning one or more predefined tags which relate to the affective state evoked in the audience (e.g., 'fascinating', 'funny', 'courageous', 'unconvincing' or 'long-winded'). By automatic classification experiments, they demonstrate the usefulness of linguistic features for predicting these subjective ratings. Extensive test runs are conducted to assess the influence of the classifier and feature selection, and individual linguistic features are evaluated with respect to their discriminative power. In the result, classification whether the frequency of a given tag is higher than on average can be performed most robustly for tags associated with positive valence, reaching up to 80.7% accuracy on unseen test data.
