In this paper we explore two methods for the classification of fricatives. First, for the coding of the speech, we compared two sets of acoustic measures obtained from a corpus of Romanian fricatives: (a) spectral moments, and (b) cepstral coefficients. Second, we compared two methods of determining the regions of the segments from which the measures would be extracted. In the first method, the phonetic segments were divided into three regions of approximately equal duration. In the second method, Hidden Markov Models (HMMs) were used to divide each segment into three regions such that the variances of the measures within each region were minimized. The corpus we analyzed consists of 3,674 plain and palatalized word-final fricatives from four places of articulation, produced by 31 native speakers of Romanian (20 females). We used logistic regression to classify fricatives by place, voicing, palatalization status, and gender. We found that cepstral coefficients reliably outperformed spectral moments in all classification tasks, and that using regions determined by HMM yielded slightly higher correct classification rates than using regions of equal duration.
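The two ways of dividing a segment into three regions can be illustrated with a minimal sketch. It assumes each segment is already available as an array of per-frame measures. The function names and the toy data are hypothetical. The second function is a brute-force stand-in that minimizes the summed within-region squared deviation directly; it is not the HMM procedure used in the study.

```python
import numpy as np


def equal_duration_means(frames, n_regions=3):
    """Mean feature vector of each near-equal-length region, concatenated.

    frames: array of shape (n_frames, n_features).
    """
    if frames.shape[0] < n_regions:
        raise ValueError("need at least one frame per region")
    regions = np.array_split(frames, n_regions, axis=0)
    return np.concatenate([r.mean(axis=0) for r in regions])


def min_variance_boundaries(frames):
    """Exhaustive search for cut points (i, j) that minimize the summed
    within-region squared deviation of frames[:i], frames[i:j], frames[j:].

    Assumption: this only mimics the stated objective of the HMM-based
    segmentation; it is not an HMM.
    """
    n = frames.shape[0]
    if n < 3:
        raise ValueError("need at least three frames")
    best_cost, best_cut = np.inf, None
    for i in range(1, n - 1):
        for j in range(i + 1, n):
            cost = sum(
                ((seg - seg.mean(axis=0)) ** 2).sum()
                for seg in (frames[:i], frames[i:j], frames[j:])
            )
            if cost < best_cost:
                best_cost, best_cut = cost, (i, j)
    return best_cut


# Hypothetical toy segment: one measure per frame, three unequal steady states.
toy = np.array([0, 0, 1, 1, 1, 1, 1, 5, 5, 5], dtype=float).reshape(-1, 1)
equal_means = equal_duration_means(toy)
cuts = min_variance_boundaries(toy)
```

On the toy segment, the equal-duration split mixes the first two steady states in its first region, while the variance-minimizing split places its boundaries at the changes. The per-region means from either split could then serve as predictors in a logistic regression.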
This study compares two methods for classifying voiceless sibilant fricatives forming a 4-way phonemic contrast found in Russian, but otherwise cross-linguistically rare. One method uses spectral measures, namely vowel formants, center of gravity (COG), duration, and intensity of frication. The second method uses cepstral coefficients extracted from different regions inside fricatives and neighboring vowels. The corpus comprises 1,431 plain and palatalized fricatives from two places of articulation, produced by 10 speakers. Logistic regression was used to classify the productions of males and females together and separately. The productions of females yielded higher correct classification rates (highest 91.9%). Cepstral measures outperformed spectral measures across the board.
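As an illustration of one of the spectral measures named above, the spectral center of gravity of a frame can be computed as the power-weighted mean frequency. This is a minimal sketch under assumptions: the abstract does not state the window type or whether power or amplitude weighting was used, so the Hann window, the power weighting, and the function name are illustrative choices.

```python
import numpy as np


def center_of_gravity(frame, sr):
    """Power-weighted mean frequency (Hz) of one frame of audio samples.

    Assumptions: Hann window and power (squared-magnitude) weighting.
    """
    windowed = frame * np.hanning(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    total = power.sum()
    if total == 0:
        raise ValueError("frame has no energy")
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sr)
    return float((freqs * power).sum() / total)


# Sanity check on a synthetic 1000 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(1600) / sr
tone_cog = center_of_gravity(np.sin(2 * np.pi * 1000.0 * t), sr)
```

For a pure tone the result sits at the tone's frequency; for frication noise it summarizes where the energy is concentrated in a single number.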
Goddard, C. I., J. W. Lilley, and J. S. Tait. 1974. Effects of M.S. 222 anesthetization on temperature selection in lake trout, Salvelinus namaycush. J. Fish. Res. Board Can. 31: 100-103. Yearling lake trout (Salvelinus namaycush) were anesthetized with a 150 ppm solution of M.S. 222 for 2 min at 10 C. When tested in a vertical temperature gradient, their behavior was abnormal for 5 days following anesthetization. Initially, they remained at the bottom of the gradient tank as much as 63% of the time, and when they did swim up into the gradient, their temperature selection was much less precise than that of control fish. The percentage of fish on the bottom declined daily, and on the 6th day their temperature distribution did not differ from that of controls.
In the current study, we explore the factors underlying the well-known difficulty in acoustic classification of front nonsibilant fricatives (Maniwa, Jongman & Wade 2009, McMurray & Jongman 2011) by applying a relatively novel classification method to the productions of Greek speakers. The Greek fricative inventory [f v θ ð s z ç ʝ x ɣ] includes voiced and voiceless segments from five distinct places of articulation. Our corpus contains all of the Greek fricatives produced by 29 monolingual speakers, but our focus is on the distinction between the front nonsibilant fricatives [f v θ ð]. For comparison, we also discuss the other places of articulation where relevant. The method is based on cepstral coefficients and has previously been successful in categorizing English obstruent bursts (Bunnell, Polikoff & McNicholas 2004), English vowels (Ferragne & Pellegrino 2010), Romanian fricatives (Spinu & Lilley 2016), and Russian fricatives (Spinu, Kochetov & Lilley 2018). For this study, fricative boundaries were automatically aligned using Hidden Markov Models (HMMs) and then manually checked. Six Bark-frequency cepstral coefficients (c0–c5) were extracted from 20-millisecond Hann windows. HMMs were used to divide the fricatives and adjacent vowels into three regions of internally minimized variance. A multinomial logistic regression analysis then used the mean cepstral coefficients from each region as predictors for classification by consonant identity. Our method yields high correct classification rates, exceeding the performance of previous methods. We discuss these results in light of the differences in the phonemic distributions of fricatives between English and Greek.
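The feature extraction step described above (six Bark-frequency cepstral coefficients from 20-millisecond Hann windows) can be sketched roughly as follows. Only the window length, window type, and number of coefficients come from the abstract. The hop size, the number of filters, the triangular filter shape, the Traunmüller Bark approximation, and all function names are assumptions made for illustration, so this should not be read as the authors' actual pipeline.

```python
import numpy as np


def hz_to_bark(f):
    # Traunmueller (1990) approximation; the Bark formula used in the
    # study is not stated, so this choice is an assumption.
    return 26.81 * f / (1960.0 + f) - 0.53


def bark_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the Bark scale (assumed shape)."""
    z = hz_to_bark(np.fft.rfftfreq(n_fft, d=1.0 / sr))
    edges = np.linspace(z[0], z[-1], n_filters + 2)
    fb = np.zeros((n_filters, z.size))
    for k in range(n_filters):
        lo, mid, hi = edges[k], edges[k + 1], edges[k + 2]
        rising = (z - lo) / (mid - lo)
        falling = (hi - z) / (hi - mid)
        fb[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def bark_cepstra(signal, sr, win_ms=20.0, hop_ms=10.0, n_filters=20, n_ceps=6):
    """Return an (n_frames, n_ceps) array of coefficients c0..c5 per frame.

    hop_ms and n_filters are illustrative defaults, not values from the study.
    """
    win = int(round(sr * win_ms / 1000.0))
    hop = int(round(sr * hop_ms / 1000.0))
    if signal.size < win:
        raise ValueError("signal shorter than one analysis window")
    window = np.hanning(win)
    fb = bark_filterbank(n_filters, win, sr)
    # Unnormalized DCT-II basis; row 0 yields c0, the sum of log band energies.
    basis = np.cos(
        np.pi * np.outer(np.arange(n_ceps), np.arange(n_filters) + 0.5) / n_filters
    )
    frames = []
    for start in range(0, signal.size - win + 1, hop):
        power = np.abs(np.fft.rfft(signal[start:start + win] * window)) ** 2
        log_energy = np.log(fb @ power + 1e-10)  # small floor avoids log(0)
        frames.append(basis @ log_energy)
    return np.array(frames)
```

Averaging the rows of the returned array within each region of a segment gives one mean coefficient vector per region, which is the kind of predictor the regression step takes as input.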
We will demonstrate the ModelTalker Voice Recorder (MT Voice Recorder), an interface system that lets individuals record and bank a speech database for the creation of a synthetic voice. The system guides users through an automatic calibration process that sets pitch, amplitude, and silence. It then presents each utterance with both visual (text-based) and auditory prompts. Each recording is screened for pitch, amplitude, and pronunciation, and users are given immediate feedback on its acceptability. Users can then rerecord an unacceptable utterance. Recordings are automatically labeled and saved, and a speech database is created from them. The system is intended to make recording a corpus of utterances relatively easy for those inexperienced in linguistic analysis. Ultimately, the recorded corpus and the resulting speech database are used for concatenative speech synthesis, allowing individuals at home or in clinics to create a synthetic voice from their own voice. The interface may prove useful for other purposes as well: it facilitates the recording and labeling of large corpora of speech, making it useful for speech and linguistic research, and it provides immediate feedback on pronunciation, making it useful as a clinical learning tool.