Tobias Bocklet scite author profile

A Survey on perceived speaker traits: Personality, likability, pathology, and the first challenge

The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits (the first challenge of this kind): personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.

NeuroSpeech: An open-source software for Parkinson's speech analysis

Orozco-Arroyave

Vásquez-Correa

Vargas-Bonilla

et al. 2018

Digital Signal Processing

Age and gender recognition for telephone applications based on GMM supervectors and support vector machines

Bocklet

Maier

Bauer

et al. 2008

This paper compares two approaches to automatic age and gender classification with 7 classes. The first approach uses Gaussian Mixture Models (GMMs) with Universal Background Models (UBMs), which is well known from speaker identification/verification. Training is performed with the EM algorithm or MAP adaptation, respectively. In the second approach, a GMM is trained for each speaker of the test and training sets. The means of each model are extracted and concatenated, which results in a GMM supervector for each speaker. These supervectors are then used in a support vector machine (SVM). Three different kernels were employed for the SVM approach: a polynomial kernel (with different polynomials), an RBF kernel, and a linear GMM distance kernel based on the KL divergence. With the SVM approach we improved the recognition rate to 74% (p < 0.001) and are in the same range as humans.
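The supervector construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes diagonal covariances and mean-only MAP adaptation from a shared UBM, so that the components line up across speakers. The function names, the relevance factor of 16, and the toy UBM values are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of one frame under a diagonal-covariance Gaussian."""
    return -0.5 * sum((xi - m) ** 2 / v + math.log(2 * math.pi * v)
                      for xi, m, v in zip(x, mean, var))

def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given one frame."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to one speaker's frames.

    The relevance factor is an assumed value, not taken from the paper.
    """
    k, d = len(means), len(means[0])
    counts = [0.0] * k                       # soft frame counts per component
    sums = [[0.0] * d for _ in range(k)]     # posterior-weighted frame sums
    for x in frames:
        post = posteriors(x, weights, means, variances)
        for c in range(k):
            counts[c] += post[c]
            for j in range(d):
                sums[c][j] += post[c] * x[j]
    adapted = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)
        adapted.append([
            alpha * (sums[c][j] / counts[c] if counts[c] > 0 else 0.0)
            + (1.0 - alpha) * means[c][j]
            for j in range(d)
        ])
    return adapted

def supervector(frames, weights, means, variances, relevance=16.0):
    """Concatenate the adapted component means into one flat vector."""
    adapted = map_adapt_means(frames, weights, means, variances, relevance)
    return [value for mean in adapted for value in mean]

# Toy example: a hypothetical 2-component UBM over 2-dimensional features.
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [10.0, 10.0]]
ubm_variances = [[1.0, 1.0], [1.0, 1.0]]
speaker_frames = [[1.0, 1.0]] * 16
sv = supervector(speaker_frames, ubm_weights, ubm_means, ubm_variances)
```

One supervector per speaker would then serve as the input to an SVM. Only the component near the speaker's frames moves away from the UBM mean; components with little supporting data stay close to the UBM.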

Towards an automatic evaluation of the dysarthria level of patients with Parkinson's disease

Vásquez-Correa

Orozco-Arroyave

Bocklet

et al. 2018

Journal of Communication Disorders

The m-FDA scale was introduced to assess the dysarthria level of patients with PD. Articulation features extracted from continuous speech signals to create i-vectors were the most accurate for quantifying the dysarthria level, with correlations of up to 0.69 between the predicted m-FDA scores and those assigned by the phoniatricians. When the dysarthria levels were estimated from dedicated speech exercises such as rapid repetition of syllables (DDKs) and read texts, the correlations were 0.64 and 0.57, respectively. In addition, the combination of several feature sets and speech tasks improved the results, which validates the hypothesis about the contribution of information from different tasks and feature sets when assessing dysarthric speech signals. The speaker models seem promising for individual modeling to monitor the dysarthria level of PD patients. The proposed approach may help clinicians to make more accurate and timely decisions about the evaluation and therapy associated with the dysarthria level of patients. The proposed approach is a great step towards unobtrusive/ecological evaluations of patients with dysarthric speech without the need to attend medical appointments.
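As an illustration of how agreement between predicted and clinician-assigned scores can be measured, the sketch below computes a Pearson correlation coefficient. The abstract does not say which correlation coefficient was used, so the choice of Pearson is an assumption, and the score lists are invented placeholder values, not data from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)

# Hypothetical placeholder scores, for illustration only.
predicted_scores = [12.0, 20.5, 31.0, 18.0, 27.5]
clinician_scores = [10.0, 24.0, 29.0, 15.0, 30.0]
r = pearson(predicted_scores, clinician_scores)
```

A value close to 1 indicates that the predicted scores rise and fall together with the clinicians' ratings.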

The INTERSPEECH 2012 speaker trait challenge

et al. 2012

The INTERSPEECH 2012 Speaker Trait Challenge provides for the first time a unified test-bed for 'perceived' speaker traits: Personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In this paper, we describe these three Sub-Challenges, Challenge conditions, baselines, and a new feature set by the openSMILE toolkit, provided to the participants.
