Low-frequency vocal modulations here designate slow disturbances of the phonatory frequency F0. They are present in all voiced speech sounds, but their properties may be affected by neurological disease. An analysis method, based on continuous wavelet transforms, is proposed to extract the phonatory frequency trace and the low-frequency vocal modulation in sustained speech sounds. The method is used to analyze a corpus of vowels uttered by male and female speakers, some of whom are healthy and some of whom suffer from Parkinson's disease. The latter present general speech problems, but their voice is not perceived as tremulous. The objective is to discover differences between speaker groups in F0 low-frequency modulations. Results show that Parkinson's disease has different effects on the voice of male and female speakers. The average phonatory frequency is significantly higher for male parkinsonian speakers. The modulation amplitude is significantly higher for female parkinsonian speakers. The modulation frequency is significantly higher and the ratio between the modulation energies in the frequency bands [3 Hz, 7 Hz] and [7 Hz, 15 Hz] is significantly lower for parkinsonian speakers of both genders.
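As an illustration of the band-energy ratio mentioned above, the following minimal sketch computes the ratio of modulation energies in [3 Hz, 7 Hz] and [7 Hz, 15 Hz] for an F0 trace. It assumes the F0 trace has already been extracted and is uniformly sampled, and it uses a plain Fourier periodogram as a stand-in for the wavelet-based analysis, so it is not the method proposed in the article. The function names, the sampling rate, and the synthetic trace are hypothetical and chosen for illustration only.

```python
import numpy as np


def band_energy(trace, fs, f_lo, f_hi):
    """Energy of `trace` in the band [f_lo, f_hi) Hz, estimated from its FFT.

    Assumption: a Fourier periodogram replaces the wavelet analysis here.
    """
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()  # remove the average phonatory frequency
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return spectrum[mask].sum()


def modulation_energy_ratio(f0_trace, fs):
    """Ratio of modulation energies in [3, 7) Hz and [7, 15) Hz."""
    low = band_energy(f0_trace, fs, 3.0, 7.0)
    high = band_energy(f0_trace, fs, 7.0, 15.0)
    return low / high


# Hypothetical F0 trace: 120 Hz average, a 5 Hz modulation of amplitude
# 1 Hz and a 10 Hz modulation of amplitude 0.5 Hz, sampled at 200 Hz.
fs = 200.0
t = np.arange(0.0, 2.0, 1.0 / fs)
f0 = 120.0 + 1.0 * np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 10 * t)

# Energy scales with the squared amplitude, so the ratio is close to 4.
ratio = modulation_energy_ratio(f0, fs)
print(ratio)
```

In this synthetic case, a stronger modulation component above 7 Hz lowers the ratio, which is the direction of the difference reported for parkinsonian speakers.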
ACCEPTED MANUSCRIPT
Abstract

The objective is to analyze vocal dysperiodicities in connected speech produced by dysphonic speakers. The analysis involves a variogram-based method that enables tracking instantaneous vocal dysperiodicities. The dysperiodicity trace is summarized by means of the signal-to-dysperiodicity ratio, which has been shown to correlate strongly with the perceived degree of hoarseness of the speaker. Previously, this method has been evaluated on small corpora only. In this article, analyses have been carried out on two corpora comprising over 250 and 700 speakers. This has enabled carrying out multi-frequency band and multi-cue analyses without risking over-fitting. The analysis results are compared to the cepstral peak prominence, which is a popular cue that indirectly summarizes vocal dysperiodicities frame-wise. A perceptual rating has been available for