2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication 2012
DOI: 10.1109/roman.2012.6343815

Expressive synthetic voices: Considerations for human robot interaction

Abstract: As speech synthesis technology develops more advanced paralinguistic capabilities, open questions emerge regarding how humans perceive the use of such vocal capabilities by robots. Perceptions of spoken interaction are complex and influenced by multiple factors including the linguistic content of a message, social context, perceived intelligence of the agent, and form factor of its embodiment. This paper shares results from a study that controlled for the above factors in order to investigate the effect on hum…

Cited by 7 publications (8 citation statements)
References 25 publications
“…This prediction stems from related work conducted in laboratory settings with other types of interlocutors (e.g., robot in Gallé et al, 2017;Marge et al, 2010), with greater expressiveness of the voice relating to positive ratings by users (e.g. Hennig & Chellali, 2012).…”
Section: Experiments 1: Chatbot User Study (mentioning)
confidence: 99%
“…Furthermore, no prior experiments have parametrically tested the presence of these two elements in controlled studies; doing so allows us to test whether there is a cumulative effect of these cognitive-emotional insertions. Finally, conducting an experiment directly through the Alexa system is an innovative approach that builds on past work that has largely relied on naturalness ratings of synthetic voices with no interactive component for the rater themselves (e.g., Marge et al, 2010;Gálvez et al, 2017;Hennig & Chellali, 2012;Schmitz et al, 2007).…”
Section: Introduction (mentioning)
confidence: 99%
“…The most commonly used evaluation protocols for TTS are listening tests with respect to quality, naturalness, intelligibility, similarity and expressiveness [7,8,9,10]. Application-dependent measures are also used, such as those for audiobook reading [11] and spoken dialogue systems or human-robot interaction [12,13,14,15,16,17,18]. For example, Su et al showed that emotional TTS for healthcare systems can be used to enable systems to provide warmer feedback of the system [16].…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, computers may also speak to humans by synthetic voice (Juan et al, 2015) and listen to us using speech recognition. To understand these processes for both human and machine, we have to study carefully the structures and functions of spoken language: how to produce and perceive it and how speech technology may help us to communicate (Hennig and Chellali, 2012). Speech segmentation is the process of splitting the speech into separately words and each word is saved in separated audio file for the upcoming processing as shown in Fig. 1.…”
Section: Introduction (mentioning)
confidence: 99%