This study investigated the relationship between acoustic spectral/cepstral measures and listener severity ratings in normal and disordered voice samples. CAPE-V sentence samples and the vowel /a/ were elicited from eight normal speakers and 24 patients with varying degrees of dysphonia severity. Samples were analysed for cepstral peak prominence (CPP), the ratio of low-to-high spectral energy, and their respective standard deviations. Perceptual ratings of overall severity were also obtained for all samples. Results showed that all acoustic variables combined in a four-factor model that correlated with perceived severity at R = 0.81 (R² = 0.65). For the vowel /a/, a five-factor model incorporating all acoustic variables and gender correlated with perceived severity at R = 0.96 (R² = 0.91). Results indicate that a strong relationship between perceptual and acoustic estimates of dysphonia severity can be achieved in both continuous speech and vowel contexts using a model incorporating spectral/cepstral measures.
The recommended protocols for instrumental assessment of voice using laryngeal endoscopic imaging, acoustic, and aerodynamic methods will enable clinicians and researchers to collect a uniform set of valid and reliable measures that can be compared across assessments, clients, and facilities.
The purpose of the study was to identify a subset of spectral/cepstral-based analysis methods that would most effectively predict dysphonia severity (as estimated via auditory-perceptual analysis) in samples of continuous speech. Acoustic estimates of dysphonia severity were used as an objective treatment outcomes measure in a set of pre- vs. post-treatment speech samples. Pre- and post-treatment continuous speech samples from 104 females with primary muscle tension dysphonia (MTD) were rated by listeners using a 100-point visual analogue scale (VAS) and analysed acoustically with spectral/cepstral-based measures. Stepwise linear regression produced a three-factor model that was strongly correlated with perceived dysphonia severity ratings (R = .85; R² = .73). The model consisted of the cepstral peak prominence (CPP), the mean ratio of low-to-high frequency spectral energy, and the standard deviation of that ratio. Mean differences between predicted and perceptual ratings for pre- and post-treatment speech samples were < 6 points on the 100-point VAS; mean absolute differences between predicted and perceived ratings were < 16 points on the 100-point VAS (equivalent to within one scale value on commonly used 7-point equal-appearing interval rating scales). A multi-parameter acoustic model consisting of spectral/cepstral-based measures shows considerable promise as an objective measure of dysphonia severity in continuous speech, even across the diverse voice types and severities observed in pre- and post-treatment MTD speech samples.
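The two agreement figures reported above measure different things: a mean (signed) difference reflects systematic bias, because over- and under-predictions cancel, whereas a mean absolute difference reflects the typical size of the error. The following is a minimal sketch of that distinction; the ratings and the function names are invented for illustration and are not data or code from the study.

```python
from statistics import mean


def mean_difference(predicted, perceived):
    """Signed bias: positive and negative errors cancel each other."""
    return mean(p - q for p, q in zip(predicted, perceived))


def mean_absolute_difference(predicted, perceived):
    """Typical error magnitude: errors do not cancel."""
    return mean(abs(p - q) for p, q in zip(predicted, perceived))


# Hypothetical ratings on a 100-point VAS (illustrative values only).
predicted = [20.0, 55.0, 70.0, 35.0]
perceived = [30.0, 45.0, 60.0, 45.0]

bias = mean_difference(predicted, perceived)
mad = mean_absolute_difference(predicted, perceived)
print(f"mean difference: {bias:.1f}, mean absolute difference: {mad:.1f}")
```

In this toy data set the signed errors cancel exactly while each individual prediction is still 10 points off, which is why the two statistics are usually reported together.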
The purpose of this study was to extend understanding of the effects of aging on the female voice by obtaining measures of both acoustic and respiratory-based performance in groups of 18-30, 40-49, 50-59, 60-69, and 70-79-year-old subjects. Acoustic measures of speaking fundamental frequency (SFF), pitch sigma, jitter, shimmer, and signal-to-noise ratio, as well as respiratory-based measures of vital capacity (VC), maximum phonation time (MPT), and phonation quotient (PQ) were obtained. Results indicated that the age groups differed significantly in terms of SFF, pitch sigma, MPT, and VC. In addition, discriminant function analysis was used to classify subjects into age groups via a three-variable model consisting of VC, SFF, and pitch sigma (84% accuracy), and into pre- vs. post-menopausal status via a two-variable model consisting of VC and pitch sigma (92% accuracy). It appears that declines in the respiratory and laryngeal mechanisms may occur simultaneously in the aging female.
During assessment and management of individuals with voice disorders, clinicians routinely attempt to describe or quantify the severity of a patient's dysphonia. This investigation used acoustic measures derived from sustained vowel samples to predict dysphonia severity (as determined by auditory-perceptual ratings) for a diverse set of voice samples obtained from 134 adult females with and without voice disorders. Stepwise multiple regression analysis on all voice samples, followed by randomized and repeated cross-validation (random selection of 75% of the original 134 voice sample corpus; 100 iterations), indicated that a four-variable model composed of time- and spectral-based acoustic measures was able to strongly predict perceived severity of dysphonia (mean R = .880; mean R² = .775). A cepstral-based measure (CPP/EXP ratio) was determined to be the most significant contributor to the prediction of dysphonia severity, though it is clear that the addition of other acoustic measures (pitch sigma; shimmer (dB); and the Discrete Fourier Transformation ratio, a measure of low versus high frequency spectral energy) adds substantially to the accurate prediction of severity. The results are interpreted and discussed with respect to the key acoustic characteristics that contributed to the prediction of severity, the value of identifying a subset of time- and spectral-based acoustic measures that appear sensitive to a perceptually diverse set of voices, and the possible use of acoustic models in guiding auditory-perceptual ratings.
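The repeated random-subsample procedure described above (75% of 134 samples, 100 iterations) can be sketched as follows. This is one plausible reading, not the authors' implementation: it assumes the model is refit on each random 75% subsample and that the multiple R of each refit is then averaged. The data are synthetic, and the helper names (`solve`, `fit_r`, `repeated_subsample_r`) are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_r(X, y):
    """Ordinary least squares with an intercept; returns the multiple R."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    ybar = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - ybar) ** 2 for t in y)
    return max(1.0 - ss_res / ss_tot, 0.0) ** 0.5


def repeated_subsample_r(X, y, fraction=0.75, iterations=100, seed=0):
    """Refit on `iterations` random subsamples and collect each multiple R."""
    rng = random.Random(seed)
    size = int(fraction * len(y))
    results = []
    for _ in range(iterations):
        idx = rng.sample(range(len(y)), size)
        results.append(fit_r([X[i] for i in idx], [y[i] for i in idx]))
    return results


# Synthetic stand-in for 134 voice samples with four acoustic predictors.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(134)]
y = [2.0 * r[0] - r[1] + 0.5 * r[2] + 0.5 * r[3] + data_rng.gauss(0, 1) for r in X]

rs = repeated_subsample_r(X, y)
mean_r = sum(rs) / len(rs)
print(f"mean R over {len(rs)} subsamples: {mean_r:.3f}")
```

Averaging R over many subsamples indicates how stable the model's fit is across different draws from the corpus; a design that scores each refit on the held-out 25% would be a stricter test of generalization.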
In a sample of dysphonic speakers (hypofunctional etiologies) versus typical speakers, spectral/cepstral measures of CPP and L/H ratio were able to differentiate these groups from one another in both vowel prolongation and continuous speech contexts with high sensitivity and specificity. The results of this study support the growing body of literature documenting the significant value of cepstral and other spectral-based acoustic measures to the clinical evaluation and management processes.
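Sensitivity and specificity for a single acoustic measure can be computed from a classification cutoff, as in the minimal sketch below. The CPP values, the cutoff, and the assumption that values below the cutoff are classified as dysphonic are all illustrative and are not taken from the study.

```python
def sensitivity_specificity(dysphonic, typical, cutoff):
    """Classify a voice as dysphonic when its measure falls below `cutoff`.

    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    true_pos = sum(v < cutoff for v in dysphonic)
    false_neg = len(dysphonic) - true_pos
    true_neg = sum(v >= cutoff for v in typical)
    false_pos = len(typical) - true_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical CPP values in dB (invented for illustration only).
dysphonic_cpp = [3.1, 4.0, 4.8, 5.5, 6.9]
typical_cpp = [6.2, 7.4, 8.0, 8.3, 9.1]

sens, spec = sensitivity_specificity(dysphonic_cpp, typical_cpp, cutoff=6.5)
print(f"sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```

Moving the cutoff trades one quantity against the other, so the two are normally reported as a pair for a stated threshold.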