Both historical sound change and laboratory confusion studies show strong asymmetries in consonant confusions. Historically, [ki] commonly changes to [tʃi] (e.g., English chill, cognate with cool), but not the reverse. Similarly, Winitz et al. [J. Acoust. Soc. Am. 51, 1309–1317 (1972)], in a consonant confusion study, found [ki] confused with [ti] more often than the reverse. It is hypothesized that such asymmetries arise when two sounds are acoustically similar except for one or more differentiating cues, and those cues are subject to a highly directional perceptual error. For example, if sound x possesses a cue that sound y lacks, listeners are more likely to miss that cue than to introduce it spuriously. /k/ and /t/ before /i/ have similar formant transitions but differ in their burst spectra: /k/ has a sharp mid-frequency peak that /t/ lacks. Listeners are more likely to miss the spectral peak for /k/ than to introduce it in the burst of /t/. Consonant confusion studies of Italian syllables support this hypothesis: Italian listeners confused /ki/ with /ti/ with increasing asymmetry when the S/N ratio decreased (where noise masks the burst more than the formant transitions) and when the burst was excised completely. Implications for phonetic theory and speech technology will be discussed.
Abstract. The paper describes a field evaluation of the automated 'reverse directory assistance' service presently in use in Italy, in which information about names and addresses is provided by a TTS system. A simulation of the service using a natural voice was also run to obtain comparative data. Both services were accessed from an office room and from a call-box on the street. Several evaluation metrics were used, including intelligibility, task completion, task correctness, transaction success, and users' reactions. The aim of the work was to evaluate TTS synthesis in real-world use and to compare laboratory data with data on system performance in a real application. The comparison suggested that laboratory tests should simulate more closely the conditions that can be predicted in real-world use, by including important aspects that are generally not taken into consideration in laboratory tests and that are likely to have a large influence on TTS system performance, such as environmental noise, prosody, and task complexity. The results also underline the importance of field evaluations for obtaining an overall view of the usability of a service in real applications, with users who are as similar as possible to actual users.