Progress in Speech Synthesis 1997
DOI: 10.1007/978-1-4612-1894-4_43

Perception of Synthetic Speech

Abstract: This chapter summarizes the results we obtained over the last 15 years at Indiana University on the perception of synthetic speech produced by rule. A wide variety of behavioral studies have been carried out on phoneme intelligibility, word recognition, and comprehension to learn more about how human listeners perceive and understand synthetic speech. Some of this research, particularly the earlier studies on segmental intelligibility, was directed toward applied issues dealing with perceptual evaluation an…

Cited by 30 publications (5 citation statements)
References 27 publications (22 reference statements)
“…On-line techniques may be more sensitive. Pisoni (1987, 1997) used reaction time measures to compare and evaluate perfectly intelligible synthetic speech systems. Reaction time measures, such as lexical decision time or phoneme detection time, are assumed to reflect the speed with which different types of speech can be processed.…”
Section: Experiments 1: Linear Vs Nonlinear Time Compression
Citation type: mentioning
confidence: 99%
“…In prior literature, researchers have investigated the perceptions of synthetic speech in varied behavioral settings [14], but have not investigated how properties of machine-like voice such as pitch contour and flanging vary on a machine-to-human spectrum or influence user trust. Accordingly, in this study we examined two main research questions: 1) Do properties of machine-like speech vary on a machine-to-human spectrum?…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Although Kanzi’s performance with computer-generated speech was lower than with natural speech (both unmanipulated and degraded), he still chose the correct lexigram at a rate significantly higher than chance for the unmanipulated computer-generated stimuli and for the sinusoidally degraded versions of these stimuli. It is worth noting that the specific synthesis mode (formant synthesis) used for creating the computer-generated stimuli was overtly crude and un-human like (robotic-like) and previous work has shown that humans can struggle with such computer-generated stimuli, primarily when presented with novel phrases (Pisoni 1997). That Kanzi could still understand even some degraded versions of formant synthesised computer-generated stimuli demonstrates the existence of perceptual mechanisms in bonobos that are remarkably resilient when presented with highly deviant, non-natural speech.…”
Section: Discussion
Citation type: mentioning
confidence: 99%