This paper addresses the problem of automatic identification of vowels uttered in isolation by female and child speakers. In this case, the magnitude spectrum of voiced vowels is sparsely sampled since only frequencies at integer multiples of F0 are significant. This impacts negatively on the performance of vowel identification techniques that either ignore pitch or rely on global shape models. A new pitch-dependent approach to vowel identification is proposed that emerges from the concept of timbre and that defines perceptual spectral clusters (PSC) of harmonic partials. A representative set of static PSC-related features are estimated and their performance is evaluated in automatic classification tests using the Mahalanobis distance. Linear prediction features and Mel-frequency cepstral coefficients (MFCC) coefficients are used as a reference and a database of five (Portuguese) natural vowel sounds uttered by 44 speakers (including 27 child speakers) is used for training and testing the Gaussian models. Results indicate that perceptual spectral cluster (PSC) features perform better than plain linear prediction features, but perform slightly worse than MFCC features. However, PSC features have the potential to take full advantage of the pitch structure of voiced vowels, namely in the analysis of concurrent voices, or by using pitch as a normalization parameter.
The goal of this study was to analyse perceptually and acoustically the voices of patients with Unilateral Vocal Fold Paralysis (UVFP) and compare them to the voices of normal subjects. These voices were analysed perceptually with the GRBAS scale and acoustically using the following parameters: mean fundamental frequency (F0), standard-deviation of F0, jitter (ppq5), shimmer (apq11), mean harmonics-to-noise ratio (HNR), mean first (F1) and second (F2) formants frequency, and standard-deviation of F1 and F2 frequencies. Statistically significant differences were found in all of the perceptual parameters. Also the jitter, shimmer, HNR, standard-deviation of F0, and standard-deviation of the frequency of F2 were statistically different between groups, for both genders. In the male data differences were also found in F1 and F2 frequencies values and in the standard-deviation of the frequency of F1. This study allowed the documentation of the alterations resulting from UVFP and addressed the exploration of parameters with limited information for this pathology.
The incidence of chronic diseases is increasing and monitoring patients in a home environment is recommended. Noncompliance with prescribed medication regimens is a concern, especially among older people. Heart failure is a chronic disease that requires patients to follow strict medication plans permanently. With the objective of helping these patients managing information about their medicines and increasing adherence, the personal medication advisor CARMIE was developed as a conversational agent capable of interacting, in Portuguese, with users through spoken natural language. The system architecture is based on a language parser, a dialog manager, and a language generator, integrated with already existing tools for speech recognition and synthesis. All modules work together and interact with the user through an Android application, supporting users to manage information about their prescribed medicines. The authors also present a preliminary usability study and further considerations on CARMIE.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.