Recent developments in openSMILE, the Munich open-source multimedia feature extractor

Eyben, Florian; Weninger, Felix; Groß, Florian; Schuller, Björn

doi:10.1145/2502081.2502224

Cited by 1,247 publications

(911 citation statements)

References 7 publications

Citation statement types: Supporting, Mentioning (795), Contrasting, Unclassified

“…In our experiments, we used two sets of features extracted with openSMILE [6], and a set of knowledge-oriented features known in the literature to have impact on the personality classification tasks tackled here, henceforth referred to as knowledge-based features. The Interspeech 2012 Speaker Trait Challenge Personality Sub-challenge feature set consists of 6125 features, and has been used in the present work to provide a set of baseline results (IS2012).…”

Section: Recent Progress In Parameterizations For Paralinguistics (mentioning)

confidence: 99%

Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children

Solera-Ureña, Moniz, Batista, et al. 2016

Advances in Speech and Language Technologies for Iberian Languages


Abstract. This paper investigates the use of heterogeneous speech corpora for automatic assessment of personality traits in terms of the Big Five OCEAN dimensions. The motivation for this work is twofold: the need to develop methods to overcome the lack of children's speech corpora, particularly severe when targeting personality traits, and the interest in cross-age comparisons of acoustic-prosodic features to build robust paralinguistic detectors. For this purpose, we devise an experimental setup with age mismatch utilizing the Interspeech 2012 Personality Sub-challenge, containing adult speech, as training data. As test data, we use a corpus of children's European Portuguese speech. We investigate various feature sets such as the Sub-challenge baseline features, the recently introduced eGeMAPS features and our own knowledge-based features. The preliminary results bring insights into cross-age and cross-language detection of personality traits in spontaneous speech, pointing to a stable set of acoustic-prosodic features for Extraversion and Agreeableness in both adult and child speech.



“…Again, we used TUM's open-source openSMILE feature extractor (Eyben et al., 2010, 2013; Eyben, 2014) and provided extracted feature sets on a per-chunk level and a configuration file to allow for additional frame-level feature extraction. The general strategy was to preserve the high-dimensional 2011 SSC feature set including energy, spectral, and voicing related low-level descriptors (LLDs); a few LLDs were added including logarithmic HNR, spectral harmonicity, and psycho-acoustic spectral sharpness, as in the AVEC 2011 set.…”

Section: Challenge Features (mentioning)

confidence: 99%
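The distinction drawn in the statement above, between frame-level low-level descriptors (LLDs) and per-chunk feature sets, can be illustrated with a minimal sketch. It assumes the LLD contours have already been extracted, and it applies three simple statistical functionals to each contour. The function name `chunk_functionals`, the LLD names, and the choice of functionals are hypothetical and do not reproduce openSMILE's configuration or feature naming.

```python
import statistics


def chunk_functionals(lld_frames, names):
    """Collapse frame-level LLD contours into one per-chunk feature dict.

    lld_frames: list of frames, each a list with one value per LLD.
    names: one (hypothetical) label per LLD, in the same order.
    """
    features = {}
    for i, name in enumerate(names):
        # Contour of a single LLD across all frames of the chunk.
        contour = [frame[i] for frame in lld_frames]
        features[f"{name}_mean"] = statistics.fmean(contour)
        features[f"{name}_stddev"] = statistics.pstdev(contour)
        features[f"{name}_range"] = max(contour) - min(contour)
    return features


# Two frames, two illustrative LLDs: the chunk yields 2 x 3 = 6 features.
example = chunk_functionals([[1.0, 0.5], [3.0, 0.5]], ["energy", "hnr"])
```

The per-chunk dimensionality grows as the number of LLDs times the number of functionals, which is why challenge feature sets built this way reach several thousand features.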

“…The latest version of openSMILE is described in Eyben et al. (2013). Details can be found in the openSMILE handbook, source code, and configuration file, and in Eyben (2014).…”

Section: Challenge Features (mentioning)

confidence: 99%

A Survey on perceived speaker traits: Personality, likability, pathology, and the first challenge

Schuller, Steidl, Batliner, et al. 2015

Computer Speech & Language

Self Cite


The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits, the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.


“…In a similar way, accumulated acoustic score is calculated as Mahalanobis distance of the input feature vector to the PDFs ($b_n^t$) with incoming feature vector according to the input symbol $n$ on the same transition. These accumulated scores are used together for comparing tokens to each other according to Viterbi decoding criterion (5). The result (sequence of acoustic events $W$) is then defined as sequence of states $Q$ (path through the search network) with maximum probability $\max_Q P(O, Q \mid W, a)$ when a set of $O$ feature vectors is observed and an acoustic model $a$ is used.…”

Section: Decoding Algorithm (mentioning)

confidence: 99%
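The decoding idea in the statement above can be sketched as a Viterbi search that accumulates a Mahalanobis distance per frame and keeps the cheapest path. This is a simplified illustration, not the cited system's decoder: it assumes one diagonal-covariance Gaussian per state, treats transition penalties as additive costs, and minimises total cost instead of maximising probability. All names (`mahalanobis_sq`, `viterbi_min_cost`, the state labels, and the cost values) are hypothetical.

```python
import math


def mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def viterbi_min_cost(observations, states, trans_cost, start_cost):
    """Return (best_cost, best_path) for the cheapest state sequence.

    states: dict state -> (mean, var) of its Gaussian.
    trans_cost: dict (prev, cur) -> additive cost; missing pairs are forbidden.
    start_cost: dict state -> cost of starting in that state.
    """
    names = list(states)
    # cost[s] is the best accumulated cost of any path ending in state s.
    cost = {
        s: start_cost.get(s, math.inf) + mahalanobis_sq(observations[0], *states[s])
        for s in names
    }
    back = []  # one dict of back-pointers per frame after the first
    for obs in observations[1:]:
        new_cost, pointers = {}, {}
        for cur in names:
            prev = min(names, key=lambda p: cost[p] + trans_cost.get((p, cur), math.inf))
            step = trans_cost.get((prev, cur), math.inf)
            new_cost[cur] = cost[prev] + step + mahalanobis_sq(obs, *states[cur])
            pointers[cur] = prev
        cost = new_cost
        back.append(pointers)
    # Trace the back-pointers from the cheapest final state.
    last = min(names, key=cost.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return cost[last], path


# Toy one-dimensional example with two illustrative states.
states = {"background": ([0.0], [1.0]), "event": ([5.0], [1.0])}
trans_cost = {
    ("background", "background"): 0.1, ("background", "event"): 1.0,
    ("event", "event"): 0.1, ("event", "background"): 1.0,
}
start_cost = {"background": 0.0, "event": 1.0}
observations = [[0.1], [0.2], [4.9], [5.1], [0.0]]
best_cost, best_path = viterbi_min_cost(observations, states, trans_cost, start_cost)
```

In this toy setup the switching penalty is small relative to the distance between the two state means, so the decoded path follows the observations into the `event` state and back out again.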

“…This kind of system can be built using any available toolkit for processing input audio signal [5] and toolkit for classification [27] or simply using an all-in-one system, for example a speech recognition engine [12,13]. These systems, although they are capable of classification of the acoustic events if their acoustic models are provided, tend to be unnecessarily complicated containing redundant functions that slow down the whole system [20].…”

Section: Introduction (mentioning)

confidence: 99%

Efficient acoustic detector of gunshots and glass breaking

Lojka, Pleva, Kiktová, et al. 2015

Multimed Tools Appl


An efficient acoustic event detection system, EAR-TUKE, is presented in this paper. The system is capable of processing a continuous input audio stream in order to detect potentially dangerous acoustic events, specifically gunshots or breaking glass. The system is programmed entirely in C++ (core math functions in C) and was designed to be self-sufficient without requiring additional dependencies. In the design and development process the main focus was put on easy support of new acoustic event detection, low memory profile, low computational requirements to operate on devices with low resources, and on long-term operation and continuous input stream monitoring without any maintenance. In order to satisfy these requirements, EAR-TUKE is based on a custom approach to detection and classification of acoustic events. The system uses acoustic models of events based on Hidden Markov Models (HMMs) and a modified Viterbi decoding process with an additional module to allow continuous monitoring. Cepstral Mean Normalization (CMN) and our proposed removal of basic coefficients from feature vectors to increase robustness. This paper also presents the development process and results evaluating the final design of the system.
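Cepstral Mean Normalization, mentioned in the abstract above, subtracts the mean of each cepstral coefficient from every frame. The sketch below is a minimal illustration that assumes the mean is taken over a complete, finite block of frames. A continuously monitoring system would more likely use a running or windowed mean, and the abstract does not say which variant EAR-TUKE uses. The function name `cepstral_mean_normalize` is hypothetical.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over all frames from each frame.

    frames: list of feature vectors (lists of floats) of equal length.
    Returns a new list; the input is left unchanged.
    """
    if not frames:
        return []
    count = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient across the whole block of frames.
    means = [sum(frame[d] for frame in frames) / count for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Two frames with two coefficients each; every column has zero mean afterwards.
normalized = cepstral_mean_normalize([[1.0, 10.0], [3.0, 14.0]])
```

Removing the per-coefficient mean cancels constant offsets in the cepstral domain, which is the usual motivation for applying it as a robustness step.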

