This paper presents the transcription and evaluation of the DIMEx100 corpus for Mexican Spanish. First, we describe the corpus and explain the linguistic and computational motivation for its design and collection process; then we present the phonetic antecedents and the alphabet adopted for the transcription task. The corpus has been transcribed at three different granularity levels, which are specified in detail, and corpus statistics are presented for each transcription level. A set of phonetic rules describing phonetic contexts observed empirically in spontaneous conversation is also validated against the transcription. The corpus has been used to build acoustic models and a phonetic dictionary for a speech recognition system. Initial performance results suggest that the data can be used to train good-quality acoustic models.
analysis has been complemented with a perceptual study in which the listeners assessed the naturalness and the spontaneity of the same fragments presented with or without lengthening and with or without filled pauses. First of all, results show that the filled pause realized as [eː] is the most frequent one in Spanish. There are no significant differences between the duration of lengthenings and the duration of filled pauses; however, the range of temporal values should be taken into account in the characterization of hesitations. Results also reveal that if the speaker knows that he will not be interrupted, the number of hesitation phenomena increases. Fragments with and without hesitations are perceived as having the same degree of naturalness, but not the same level of spontaneity. Finally, it is interesting to note that in spontaneous speech lengthenings might be perceived as empty pauses.
Three experiments on the perception of lexical stress in Spanish (a free-stress language) by speakers of French (a fixed-stress language) are discussed in this chapter. The main goal of these experiments is to further investigate the effect of an ‘accentual filter’ that may lead to a stress ‘deafness’ in native speakers of a fixed-stress language. Taken together, the results of the three experiments lead to the conclusion that French speakers are not only sensitive to the acoustic cues that convey stress prominences in Spanish, but are also able, after a short training, to encode and retrieve the accentual information in a small lexicon of Spanish pseudowords. However, it appears that French listeners do not always rely on the same acoustic cues as the ones used by native Spanish speakers and that their representations of the accentual patterns seem to be less flexible than the native ones.