2014
DOI: 10.1109/taffc.2014.2326393
Robust Unsupervised Arousal Rating: A Rule-Based Framework with Knowledge-Inspired Vocal Features

Abstract: Studies in classifying affect from vocal cues have produced exceptional within-corpus results, especially for arousal (activation or stress); yet cross-corpora affect recognition has only recently garnered attention. An essential requirement of many behavioral studies is affect scoring that generalizes across different social contexts and data conditions. We present a robust, unsupervised (rule-based) method for providing a scale-continuous, bounded arousal rating operating on the vocal signal. The method inco…
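The abstract only sketches the approach. As a rough illustration of what an unsupervised, bounded, scale-continuous arousal rater can look like, here is a minimal Python sketch; it is not the paper's actual rule set. The choice of cues (median F0, mean intensity), the speaker-level normalization, and the equal-weight fusion are all assumptions for illustration.

import numpy as np

def rate_arousal(median_f0, mean_intensity):
    """Unsupervised arousal ratings in (0, 1) for one speaker's utterances.

    median_f0, mean_intensity: 1-D arrays, one value per utterance.
    Each cue is z-normalized against the speaker's own distribution
    (no labels, no training), fused with equal weights, then bounded
    by a logistic squashing function.
    """
    cues = np.vstack([median_f0, mean_intensity]).astype(float)
    z = (cues - cues.mean(axis=1, keepdims=True)) / (
        cues.std(axis=1, keepdims=True) + 1e-8)
    fused = z.mean(axis=0)               # equal-weight, knowledge-based fusion
    return 1.0 / (1.0 + np.exp(-fused))  # bounded, scale-continuous output

# Example: five utterances; higher pitch plus louder speech -> higher rating
print(rate_arousal(np.array([180, 210, 170, 260, 190]),
                   np.array([60, 66, 58, 72, 62])))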

Cited by 61 publications (59 citation statements)
References 41 publications (48 reference statements)
“…Further, speech and articulation rate were found to be important for all emotional expressions. For the case of automatic arousal recognition, [22] successfully builds an unsupervised recognition framework with these descriptors. [16] perform acoustic analysis of various fundamental frequency and harmonics-related parameters on a small set of emotional speech utterances.…” (Footnote 1: http://www.speech.kth.se/wavesurfer/)
Section: Related Work (mentioning)
confidence: 99%
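The speech-rate descriptor mentioned above is commonly approximated by counting syllable nuclei per second. The sketch below uses a standard energy-peak heuristic; it is not the exact procedure of [22] or of WaveSurfer, and the thresholds are assumptions.

import numpy as np
from scipy.signal import find_peaks

def speech_rate(samples, sr, frame_ms=25, hop_ms=10):
    """Approximate syllables/second from a mono waveform (NumPy array)."""
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    n = 1 + (len(samples) - frame) // hop
    energy = np.array([np.sum(samples[i * hop:i * hop + frame] ** 2)
                       for i in range(n)])
    energy = np.convolve(energy, np.ones(5) / 5, mode="same")  # smooth envelope
    # Syllable nuclei ~ prominent energy peaks at least ~100 ms apart
    peaks, _ = find_peaks(energy, height=0.3 * energy.max(),
                          distance=int(100 / hop_ms))
    return len(peaks) / (len(samples) / sr)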
“…Predictions on the test partition were submitted with the linear (%UAR = 32.16) and the Gaussian (%UAR = 31.61) kernels, and we again obtained better performance than the baseline (%UAR = 23.82); the absolute improvement with the linear kernel is 8.34%. Therefore, a smaller, expert-knowledge-based acoustic feature set shows higher robustness for emotion recognition than a large-scale brute-force feature set, as found in [3]. …”
Section: Audio Features (mentioning)
confidence: 91%
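The %UAR figures quoted above refer to unweighted average recall: the mean of per-class recalls, so every emotion class counts equally regardless of its test-set frequency. A minimal sketch of that evaluation pipeline (the data here is random noise, so the printed scores are meaningless; only the metric and kernel comparison are the point):

import numpy as np
from sklearn.metrics import recall_score
from sklearn.svm import SVC

def uar(y_true, y_pred):
    # 'macro' recall averages recall over classes with equal weight = UAR
    return recall_score(y_true, y_pred, average="macro")

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 20)), rng.integers(0, 7, 200)
X_test, y_test = rng.normal(size=(80, 20)), rng.integers(0, 7, 80)
for kernel in ("linear", "rbf"):   # "rbf" = Gaussian kernel
    y_pred = SVC(kernel=kernel).fit(X_train, y_train).predict(X_test)
    print(kernel, f"%UAR = {100 * uar(y_test, y_pred):.2f}")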
“…In contrast to large-scale brute-force feature sets, which have been successfully applied to many speech and music classification tasks, e.g., [20,23], smaller, expert-knowledge-based feature sets have shown high robustness for emotion recognition [3]. In this light, we assembled a small acoustic feature set for the EmotiW14 Challenge, using our openSMILE toolkit [12].…”
Section: Audio Features (mentioning)
confidence: 99%
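For a sense of how such a compact, knowledge-based acoustic feature set is extracted in practice, here is a sketch using openSMILE's Python wrapper (the opensmile package). The EmotiW14 set itself was custom; the standard eGeMAPSv02 functionals are used here as a stand-in, and the file path is a placeholder.

import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,       # compact knowledge-based set
    feature_level=opensmile.FeatureLevel.Functionals,  # one vector per file
)
features = smile.process_file("clip.wav")  # placeholder path; returns a DataFrame
print(features.shape)                      # (1, n_features) for one utterance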
“…Given the advantages of relative emotions over absolute emotions, one wonders whether changes in emotion ratings can be better predicted than absolute ratings. Secondly, despite the increasing popularity of predicting emotion dimensions either at utterance level (Bone et al., 2014) or at frame level (Metallinou et al., 2011; Nicolaou et al., 2011), all of the studies focus on prediction of absolute emotions across time. From these studies, it seems that predicting absolute emotion dimensions remains challenging, and predicting absolute emotion alone may not provide insight into dynamic components, properties and regularities of emotion changes.…”
Section: Related Work (mentioning)
confidence: 99%