Automatic Modeling of Acoustic Perception of Breathiness in Pathological Voices

Castillo-Guerra, Eduardo; Ruiz, Arturo Rojo

doi:10.1109/tbme.2008.2007910

Cited by 19 publications

(11 citation statements)

References 22 publications

“…These findings are in contrast to those of Castillo-Guerra and Ruíz (2009), who compared acoustic measures to perceptual judgments of breathy voice. Two measures were used to assess the ratio between H1 and H2: harmonic energy (HE) which was based on the radiated acoustic signal, and harmonic energy of residue (HE_RES), which was calculated after passage through an inverse filter model.…”

Section: Discussion (contrasting)

confidence: 99%

“…Two measures were used to assess the ratio between H1 and H2: harmonic energy (HE) which was based on the radiated acoustic signal, and harmonic energy of residue (HE_RES), which was calculated after passage through an inverse filter model. The measure HE_RES better accounted for variance in the perceptual judgments across multiple types of perturbation, leading the authors to conclude that filtering effects of the vocal tract from the signal improves the accuracy of the information conveyed about breathy voice (Castillo-Guerra & Ruíz, 2009). The most likely explanation for this is that they did not appear to correct the harmonic amplitudes for the influences of F1 in their calculation.…”

Section: Discussion (mentioning)

confidence: 99%

Relation of Structural and Vibratory Kinematics of the Vocal Folds to Two Acoustic Measures of Breathy Voice Based on Computational Modeling

Samlan, Story. 2011. J Speech Lang Hear Res

Purpose: To relate vocal fold structure and kinematics to two acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). Method: A computational, kinematic model of the medial surfaces of the vocal folds was used to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: degree of vocal fold adduction, surface bulging, vibratory nodal point, and supraglottal constriction. CPP and H1-H2 were measured from simulated glottal area, glottal flow and acoustic waveforms and related to the underlying vocal fold kinematics. Results: CPP decreased with increased separation of the vocal processes, whereas the nodal point location had little effect. H1-H2 increased as a function of separation of the vocal processes in the range of 1–1.5 mm and decreased with separation > 1.5 mm. Conclusions: CPP is generally a function of vocal process separation. H1*-H2* will increase or decrease with vocal process separation based on vocal fold shape, pivot point for the rotational mode, and supraglottal vocal tract shape, limiting its utility as an indicator of breathy voice. Future work will relate the perception of breathiness to vocal fold kinematics and acoustic measures.
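
As an illustration of the H1-H2 measure named in the abstract above, the following is a minimal sketch and not the method of either paper. It assumes the fundamental frequency is already known, uses a synthetic two-harmonic test signal, and applies no formant correction (so it corresponds to the uncorrected H1-H2, not H1*-H2*). The function names `harmonic_amplitude` and `h1_minus_h2` are hypothetical.

```python
import cmath
import math


def harmonic_amplitude(signal, fs, freq):
    """Estimate the amplitude of the sinusoid at `freq` (Hz) with a single-bin DFT.

    Assumption: the signal spans a whole number of periods of `freq`,
    so no window is applied and there is no spectral leakage.
    """
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def h1_minus_h2(signal, fs, f0):
    """Uncorrected H1-H2 in dB: level of the first harmonic minus the second."""
    h1 = harmonic_amplitude(signal, fs, f0)
    h2 = harmonic_amplitude(signal, fs, 2.0 * f0)
    return 20.0 * math.log10(h1 / h2)


# Synthetic test signal (illustrative values only): f0 = 100 Hz,
# first harmonic amplitude 1.0, second harmonic amplitude 0.5.
fs = 16000
f0 = 100.0
n = 1600  # 0.1 s, exactly 10 periods of f0
signal = [
    1.0 * math.sin(2 * math.pi * f0 * k / fs)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * k / fs)
    for k in range(n)
]

print(round(h1_minus_h2(signal, fs, f0), 2))
```

For real speech, f0 would have to be estimated per frame and the frame windowed. That step is omitted here to keep the sketch short.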

“…Breath feature is a vocal feature according to the mechanism of speech production, 8 which includes periodic perturbation coefficient, harmonic‐noise ratio, glottic‐noise excitation ratio, harmonic structure energy, harmonic energy residual, and harmonic‐signal ratio.…”

Section: Experiments and Evaluation (mentioning)

confidence: 99%

“…Tone quality feature generally compose of formant frequency, frequency perturbation, and amplitude perturbation 7 . Castillo‐Guerra and Ruiz 8 studied the automatic evaluation model to the quality of acoustic breathing and proposed a tone quality feature for recognizing speech emotion, the feature consists of periodic disturbance coefficient, amplitude disturbance coefficient, harmonic noise ratio, glottic noise excitation ratio, harmonic energy, harmonic energy residue, and harmonic signal ratio. Tone quality feature has a high recognition performance in vigorous active emotions, such as anger and surprise.…”

Section: Introduction (mentioning)

confidence: 99%

Speech emotion recognition using emotion perception spectral feature

Jiang, Tan, Yang, et al. 2019. Concurrency and Computation

Summary Speech emotion recognition is an important technique for human‐computer interface applications. Due to contain rich information of emotion, the spectral feature is widely used for emotion recognition. However, the recognition performance is limited because of imprecise extracted rule and uncertain size of resolution of spectral feature. To address this issue, motivated by speech coding, we introduced psychoacoustics model, provided a perception spectral subband partition method for obtaining more precise frequency resolution. Moreover, we also provided a new spectral feature on the divided subband frequency signals. The proposed feature includes emotional perception entropy, spectral inclination, and spectral flatness. Then, a Support Vector Machine classifier is used to recognize emotion categories. The experiment results show that the proposed spectral feature is superior to the traditional MFCC feature, and also better than the state‐of‐the‐art Fourier feature and multi‐resolution amplitude feature.

“…In this paper, traditional acoustic measures concerning the signal periodicity, harmonic components, and aspiration noise were calculated in addition to harmonic energy of residue (HE_RES), harmonic-to-signal ratio (HSR), and number of voiced frames (NVF). The best classification performance (88.5%) was achieved with a best subset regression (BSR) analysis using linear combinations of multiple parameters [15]. Wester suggested two methods to automatically classify voice quality: regression analysis and hidden Markov models (HMMs).…”

Section: Introduction (mentioning)

confidence: 99%