Previous research showed that, in singing, vowel qualities of isolated vowel sounds can be discriminated up to a fundamental frequency (F0) of about 500 Hz. However, the literature reports indications of vowel discrimination at F0 > 500 Hz for singing (raised larynx condition, CVC context) as well as for speech-like sounds. In this study, we tested vowel discrimination at a high F0 in speech using minimal pairs built from eight long German vowels. Words were produced in speech mode at an F0 of about 650 Hz by two female speakers. For all samples except the words including /a/ and /ɛ/, F0 exceeded the F1 values given in vowel statistics for Standard German. In a listening test, stimuli were played back in random order to 14 listeners (7f, 7m) for identification. The results showed that vowel discrimination can be preserved at such high fundamental frequencies. This could mean that, for our speakers and the high fundamental frequency examined, (1) source-filter characteristics were effective up to 650 Hz, or (2) transitions played a crucial role, or (3) spectral characteristics other than formants have to be taken into account in order to explain these results.
There is a broad consensus in the literature that vowel-specific formant patterns differ as a function of gender (men/women) or age (adults/children) due to different average vocal tract sizes. Although an additional influence of fundamental frequency F0 is discussed in corresponding normalization approaches, formant patterns have rarely been compared for sounds of adults and children that exhibit the same F0, for sounds of adults with higher F0 than sounds of children, or for sounds of men with higher F0 than sounds of women. Investigating vowels of men, women, and children producing sounds with varying F0, we observed (1) a possible decrease or even a disappearance of the expected speaker-group differences in the formant frequencies < 1.5 kHz if the F0 of the utterances corresponds for children, women, and men, and (2) a possible “inversion” of the expected speaker-group differences < 1.5 kHz if the F0 of the utterances of adults is higher than that of children, or the F0 of men is higher than that of women. However, no corresponding relationship between F0 and the higher formants > 1.5 kHz was found. These observations call for a further examination of the role of F0 when interpreting speaker-group related differences in formant patterns.
Existing databases of isolated vowel sounds or vowel sounds embedded in consonantal context generally document only limited variation of basic production parameters. Thus, concerning the possible variation range of vowel and voice quality-related sound characteristics, there is a lack of broad phenomenological and descriptive references that allow for a comprehensive understanding of vowel acoustics and for an evaluation of the extent to which corresponding existing approaches and models can be generalised. In order to contribute to building up such references, a novel database of vowel sounds that exceeds any existing collection in size and diversity of vocalic characteristics is presented here, comprising c. 34600 utterances of 70 speakers (46 nonprofessional speakers, children, women and men, and 24 professional actors/actresses and singers of straight theatre, contemporary singing, and European classical singing). The database focuses on sounds of the long Standard German vowels /i-y-e-ø-a-o-u/ produced with varying basic production parameters such as phonation type, vocal effort, fundamental frequency, vowel context and speaking or singing style. In addition, a read text and, for professionals, songs are included. The database is accessible for scientific use, and further extensions are in progress.
When investigating formant pattern and spectral shape ambiguity in Klatt synthesis, an earlier study showed that the perceived vowel quality of Standard German vowel sounds can be changed by varying fundamental frequency only [Maurer et al. (2017). J. Acoust. Soc. Am. 141(5):3469-3470]. In this follow-up study, the original synthesis experiment was repeated twice: first, with the fundamental frequencies (fo) of the corresponding sounds lowered by one octave, and second, with different ratios of the first and second formant amplitudes. In this way, the role of the fo range and the formant amplitudes in the investigation of formant pattern and spectral shape ambiguity in vowel synthesis was further examined. The same five phonetic expert listeners who participated in the previous experiment also identified all of the newly synthesised sounds in a multiple-choice identification task. Results revealed that the perceived vowel quality only changes for fos above 200 Hz and that, for back vowels, the ratio of the formant amplitudes used in the Klatt synthesis also affects vowel recognition. Thus, the results of the experiments confirm earlier indications of a non-systematic relation between fo or pitch and formant patterns or spectral envelopes for vowel recognition.