Proceedings of the 21st ACM International Conference on Multimedia 2013
DOI: 10.1145/2502081.2502224

Recent developments in openSMILE, the Munich open-source multimedia feature extractor

Abstract: We present recent developments in the openSMILE feature extraction toolkit. Version 2.0 now unites feature extraction paradigms from speech, music, and general sound events with basic video features for multi-modal processing. Descriptors from audio and video can be processed jointly in a single framework allowing for time synchronization of parameters, on-line incremental processing as well as off-line and batch processing, and the extraction of statistical functionals (feature summaries), such as moments, pe…
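
The idea of statistical functionals mentioned in the abstract can be illustrated with a minimal sketch: a frame-level descriptor contour of arbitrary length is reduced to a fixed set of summary values, here the first four moments. This is not openSMILE code and does not use its API; the function name `functionals`, the choice of population (divide-by-n) moments, and the example contour are assumptions made purely for illustration.

```python
import math


def functionals(contour):
    """Summarize a frame-level descriptor contour with four moments.

    Hypothetical helper for illustration only; not part of openSMILE.
    """
    n = len(contour)
    if n == 0:
        raise ValueError("contour must contain at least one frame")
    mean = sum(contour) / n
    # Population variance (divide by n), an assumption of this sketch.
    var = sum((x - mean) ** 2 for x in contour) / n
    std = math.sqrt(var)
    if std == 0.0:
        # A constant contour has no shape; report zero for both shape moments.
        skew = 0.0
        kurt = 0.0
    else:
        skew = sum((x - mean) ** 3 for x in contour) / (n * std ** 3)
        kurt = sum((x - mean) ** 4 for x in contour) / (n * std ** 4)
    return {"mean": mean, "stddev": std, "skewness": skew, "kurtosis": kurt}


# Made-up per-frame energy values standing in for a low-level descriptor.
energy = [0.1, 0.4, 0.9, 0.4, 0.1]
summary = functionals(energy)
```

Whatever the number of frames, the result has the same four entries, which is what lets a classifier consume recordings of different durations.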

Citations: cited by 1,247 publications (911 citation statements)
References: 7 publications
“…In our experiments, we used two sets of features extracted with openSMILE [6], and a set of knowledge-oriented features known in the literature to have impact on the personality classification tasks tackled here, henceforth referred to as knowledge-based features. The Interspeech 2012 Speaker Trait Challenge Personality Sub-challenge feature set consists of 6125 features, and has been used in the present work to provide a set of baseline results (IS2012).…”
Section: Recent Progress In Parameterizations For Paralinguistics (mentioning)
confidence: 99%
“…Again, we used TUM's open-source openSMILE feature extractor (Eyben et al., 2010, 2013; Eyben, 2014) and provided extracted feature sets on a per-chunk level and a configuration file to allow for additional frame-level feature extraction. The general strategy was to preserve the high-dimensional 2011 SSC feature set including energy, spectral, and voicing related low-level descriptors (LLDs); a few LLDs were added including logarithmic HNR, spectral harmonicity, and psycho-acoustic spectral sharpness, as in the AVEC 2011 set.…”
Section: Challenge Features (mentioning)
confidence: 99%
“…The latest version of openSMILE is described in Eyben et al. (2013). Details can be found in the openSMILE handbook, source code, and configuration file, and in Eyben (2014).…”
Section: Challenge Features (mentioning)
confidence: 99%
“…In a similar way, accumulated acoustic score is calculated as Mahalanobis distance of the input feature vector to the PDFs (b n t) with incoming feature vector according to the input symbol n on the same transition. These accumulated scores are used together for comparing tokens to each other according to Viterbi decoding criterion (5). The result (sequence of acoustic events W) is then defined as sequence of states Q (path through the search network) with maximum probability max_Q P(O, Q | W, a) when a set of O feature vectors is observed and an acoustic model a is used.…”
Section: Decoding Algorithm (mentioning)
confidence: 99%
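
The decoding step described in the statement above can be sketched roughly as follows. This is a generic Viterbi search, not the citing authors' implementation: it assumes diagonal-covariance state PDFs, treats the squared Mahalanobis distance as a per-frame acoustic cost, and treats transition costs as negative log probabilities, so the maximum-probability path becomes the minimum accumulated cost path. All names (`mahalanobis_sq`, `viterbi`, `trans_cost`, `start_cost`) are hypothetical.

```python
def mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def viterbi(observations, states, trans_cost, start_cost):
    """Return (best_path, best_cost) with minimum accumulated cost.

    states:     list of (mean, var) pairs, one per state (assumed PDFs)
    trans_cost: trans_cost[i][j] is the cost of moving from state i to j
    start_cost: cost of starting in each state
    """
    if not observations:
        raise ValueError("need at least one observation")
    n = len(states)
    # cost[j]: best accumulated cost of any path ending in state j so far.
    cost = [start_cost[j] + mahalanobis_sq(observations[0], *states[j])
            for j in range(n)]
    back = []  # back[t][j]: best predecessor of state j at frame t + 1
    for obs in observations[1:]:
        new_cost, pointers = [], []
        for j in range(n):
            # Keep only the cheapest way of arriving in state j.
            best_i = min(range(n), key=lambda i: cost[i] + trans_cost[i][j])
            pointers.append(best_i)
            new_cost.append(cost[best_i] + trans_cost[best_i][j]
                            + mahalanobis_sq(obs, *states[j]))
        cost = new_cost
        back.append(pointers)
    # Trace the best final state back to the first frame.
    last = min(range(n), key=lambda j: cost[j])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, cost[last]


# Two made-up one-dimensional states centred at 0 and 5.
states = [([0.0], [1.0]), ([5.0], [1.0])]
trans_cost = [[0.0, 1.0], [1.0, 0.0]]
path, total = viterbi([[0.1], [0.2], [4.9], [5.1]], states, trans_cost, [0.0, 0.0])
```

Note that a full Gaussian log-likelihood also contains a log-determinant term; the sketch drops it, which is harmless only when all states share the same covariance.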
“…This kind of system can be built using any available toolkit for processing input audio signal [5] and toolkit for classification [27] or simply using an all-in-one system, for example a speech recognition engine [12,13]. These systems, although they are capable of classification of the acoustic events if their acoustic models are provided, tend to be unnecessarily complicated containing redundant functions that slow down the whole system [20].…”
Section: Introduction (mentioning)
confidence: 99%