2014
DOI: 10.1007/978-3-319-10816-2_72

Speech Synthesis and Uncanny Valley

Abstract: The paper discusses a hypothesis relating high quality text-to-speech (TTS) synthesis in spoken dialogue systems with the concept of "uncanny valley". It introduces a "Wizard-of-Oz" experiment with 30 volunteers engaged in conversations with two synthetic voices of different naturalness. The results of the experiment are summarized and interpreted, leading to the conclusion that the TTS uncanny valley effect in dialogue systems can probably be superseded and inverted by a positive attitude of the syst…

Cited by 13 publications (8 citation statements)
References 3 publications
“…Our non-directional Hypotheses 2a-b, stating that there would be significant group differences in pleasantness and eeriness ratings between the voices, found support in such a way that more human-like voices were experienced as significantly more pleasant and less eerie than more mechanical sounding voices. This is in agreement with prior empirical studies that also observed positive effects of anthropomorphic design features (Romportl, 2014; Baird et al., 2018; Kühne et al., 2020; Roesler et al., 2021). At the same time, it seems to contradict the Uncanny Valley hypothesis (Mori, 1970) according to which we would have expected either the quite realistic yet not perfect voices Synthetic I or II receiving the highest eeriness ratings or alternatively, assuming categorical conflicts as an important mechanism behind uncanny experiences, the real human voice (given that participants were told they were listening to a robot).…”
Section: Discussion
Citation type: supporting
confidence: 93%
“…Anecdotal evidence from two other exploratory studies suggests similar patterns: Baird et al. (2018) asked 25 listeners to evaluate the likability and human-likeness of 13 synthesized male voices and found likability to increase consistently with human-likeness. Based on data from 30 listeners, also Romportl (2014) reported that most though not all participants preferred a more natural female voice over an artificial sounding one. These results are also in line with two recent meta-analyses that overall show beneficial effects of (here, mostly visual) anthropomorphic design features for embodied robots and chatbots (e.g., on affect, attitudes, trust, or intention to use), although the dependence of these effects on various moderators (e.g., robot type, task type, and field of application) points to more complex relationships between human-likeness and user responses (Blut et al., 2021; Roesler et al., 2021).…”
Section: Human-like Voice As Anthropomorphic Cue
Citation type: mentioning
confidence: 99%
“…While a whole range of different attributes were found to be associated with the "uncanny valley" in visual tasks, comparatively little is known about the impact of synthesized voice qualities on likability and eeriness (Kuratate et al., 2009; Mitchell et al., 2011; Romportl, 2014; Chang et al., 2018). For instance, a very recent study evaluated the likability and human-likeness of a corpus of 13 German male voices, produced via five different synthesis approaches, and found that contrary to the visual "uncanny valley," likability increases monotonically with human-likeness of the voice (Baird et al., 2018).…”
Section: Visual Perception Of Humanoids
Citation type: mentioning
confidence: 99%
“…For instance, a very recent study evaluated the likability and human-likeness of a corpus of 13 German male voices, produced via five different synthesis approaches, and found that contrary to the visual "uncanny valley," likability increases monotonically with human-likeness of the voice (Baird et al., 2018). A study by Romportl (2014) showed that about three quarters of participants preferred a more natural voice over an artificial one. However, the authors added that affinity to artificial agents might be an important intervening factor in voice perception.…”
Section: Visual Perception Of Humanoids
Citation type: mentioning
confidence: 99%