The Chong language uses a combination of acoustic correlates to distinguish among its four contrastive registers (phonation types). Electroglottographic (EGG) and acoustic data from original fieldwork on the Takhian Thong dialect were examined. The EGG data show high open quotient (OQ) values for the breathy register, low OQ values for the tense register, intermediate OQ values for the modal register, and rapidly changing high-to-low OQ values for the breathy-tense register. The acoustic data indicate that H1-A3 best distinguishes breathy from non-breathy phonation, but measures like H1-H2 and pitch are necessary to discriminate tense from non-tense phonation. A comparison of the spectral tilt and OQ measures shows the strongest correlation between OQ and H1-H2, suggesting that changes in the relative amplitude of frequencies in the upper spectrum are not directly related to changes in the open period of the glottal cycle; rather, OQ correlates best with changes in the degree of glottal tension.
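As a rough illustration of the two measures compared above, the sketch below computes H1-H2 from a spectrum and OQ from glottal-cycle timings. This is a minimal toy example assuming a clean signal with a known f0, not the study's actual analysis procedure (which derived OQ from EGG recordings).

```python
import numpy as np

def h1_h2_db(signal, sr, f0):
    """Amplitude difference (dB) between the first two harmonics."""
    spec = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)

    def harmonic_amp(f):
        # search for the spectral peak within +/- 10% of the target frequency
        band = (freqs > 0.9 * f) & (freqs < 1.1 * f)
        return spec[band].max()

    return 20.0 * np.log10(harmonic_amp(f0) / harmonic_amp(2 * f0))

def open_quotient(open_dur, period):
    """Open quotient: open phase duration over the full glottal cycle."""
    return open_dur / period

# toy signal: H1 has twice the amplitude of H2, so H1-H2 should be ~6 dB
sr, f0 = 16000, 200
t = np.arange(sr) / sr
sig = 2.0 * np.sin(2 * np.pi * f0 * t) + 1.0 * np.sin(2 * np.pi * 2 * f0 * t)
print(round(h1_h2_db(sig, sr, f0), 1))   # -> ~6.0 dB
print(open_quotient(0.6, 1.0))           # -> 0.6 (a high OQ, as in breathy voice)
```

Real speech requires f0 tracking and formant correction before such harmonic measures are reliable; the point here is only the relationship between the two quantities.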
While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The paper also examines whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (HMALIGN and P2FA) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy of 70.9% within 30 ms for HMALIGN and 65.7% within 30 ms for P2FA. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in HMALIGN's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. In sum, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones.
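The within-30 ms agreement criterion used above can be sketched as a simple comparison of boundary times. The boundary values here are hypothetical; the actual evaluation also broke agreement down by segmental and tonal category.

```python
def agreement_within(auto, manual, tol=0.030):
    """Fraction of automatic boundaries within `tol` seconds of the manual ones."""
    assert len(auto) == len(manual), "boundary lists must be paired one-to-one"
    hits = sum(abs(a - m) <= tol for a, m in zip(auto, manual))
    return hits / len(auto)

# hypothetical forced-aligner vs. hand-corrected boundaries (seconds)
auto = [0.10, 0.52, 0.97, 1.44]
manual = [0.12, 0.50, 1.05, 1.45]
print(agreement_within(auto, manual))  # 3 of 4 boundaries within 30 ms -> 0.75
```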
Itunyoso Trique /itunˈjoso ˈtɾiki/ is an Oto-Manguean language (Mixtecan branch) spoken in the town of San Martín Itunyoso, Oaxaca, Mexico. It is one of three Trique languages, all spoken in the state of Oaxaca. According to the 2005 census (INEGI 2005), there are 1,345 inhabitants in the town, virtually all of whom speak Itunyoso Trique as a native language. However, this number understates the total number of speakers, as approximately 30%–50% of the population lives outside of San Martín Itunyoso at any given time. The nearby town of Concepción Itunyoso, with a population of 261 (ibid.), is considered to speak the same dialect. The remaining populations of speakers are found in Oaxaca City, Mexico City, and the United States.
This paper examines the perceptual weight of cues to the coda glottal consonant contrast in Trique (Oto-Manguean) with native listeners. The language distinguishes words with no coda (/Vː/) from words with a coda glottal stop (/Vʔ/) or a breathy coda (/Vɦ/). The results from a speeded AX (same-different) lexical discrimination task show high accuracy in lexical identification for the /Vː/-/Vɦ/ contrast, but lower accuracy for the other contrasts. The second experiment consists of a labeling task in which the three acoustic dimensions that distinguished the glottal consonant codas in production [duration, the amplitude difference between the first two harmonics (H1-H2), and F0] were modified orthogonally using step-wise resynthesis. This task determines the relative weight of each dimension in phonological categorization. The results show that duration was the strongest cue. Listeners were only sensitive to changes in H1-H2 for the /Vː/-/Vɦ/ and /Vː/-/Vʔ/ contrasts when duration was ambiguous, and only sensitive to changes in F0 for the /Vː/-/Vɦ/ contrast when both duration and H1-H2 were ambiguous. The perceptual cue weighting for each contrast closely matches existing production data [DiCanio (2012a). J. Phon. 40, 162–176]. Cue weight differences in speech perception are explained by differences in step-interval size and the notion of adaptive plasticity [Francis et al. (2008). J. Acoust. Soc. Am. 124, 1234–1251; Holt and Lotto (2006). J. Acoust. Soc. Am. 119, 3059–3071].
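The orthogonal manipulation of the three cues amounts to crossing every step value on each dimension into a full stimulus grid. The step values below are hypothetical, for illustration only; the study's actual continua were built by resynthesis from natural tokens.

```python
import itertools

# Hypothetical step values along each acoustic dimension
duration_steps = [80, 120, 160, 200]   # vowel/coda duration (ms)
h1h2_steps = [-2, 2, 6]                # H1-H2 (dB)
f0_steps = [180, 200, 220]             # F0 (Hz)

# Fully crossed ("orthogonal") design: every combination becomes one stimulus
continuum = list(itertools.product(duration_steps, h1h2_steps, f0_steps))
print(len(continuum))        # 4 * 3 * 3 = 36 stimuli
print(continuum[0])          # (80, -2, 180)
```

A fully crossed design like this is what allows the labeling responses to be modeled as a function of each cue independently, so the relative weight of each dimension can be estimated.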