2020
DOI: 10.1007/978-3-030-51870-7_4
“Speech Melody and Speech Content Didn’t Fit Together”—Differences in Speech Behavior for Device Directed and Human Directed Interactions

Cited by 9 publications (8 citation statements)
References 35 publications
“…misrecognition in computer-DS (Vertanen, 2006), as well as higher mean f0 and a larger f0 range in Lombard speech (Brumm and Zollinger, 2011; Marcoux and Ernestus, 2019). Furthermore, in contrast to other work reporting greater intensity in Alexa-DS (Raveh et al., 2019; Siegert and Krüger, 2021), we did not see a difference in intensity in the present study. This might reflect the controlled interaction, where participants were recorded with a head-mounted microphone (such that it was equidistant from their mouths for the entire experiment) and heard amplitude-normalized stimuli over headphones.…”
Section: Discussion (contrasting)
confidence: 99%
“…Overall, we found prosodic differences across Alexa- and human-DS, consistent with routinized interaction accounts proposing that people have a “routinized” way of engaging with technology (Gambino et al., 2020), and in line with prior studies finding differences in computer and voice-AI speech registers (e.g., Burnham et al., 2010; Huang et al., 2019; Siegert and Krüger, 2021). In the present study, speakers showed a systematic Alexa-DS speech style: when talking to Alexa, speakers produced sentences with a slower rate, higher mean f0, and higher f0 variation, relative to human-DS.…”
Section: Discussion (supporting)
confidence: 90%
“…One possibility is that the advanced speech capabilities of Alexa socialbots (in terms of speech recognition, language understanding, and generation) might lead to more naturalistic interactions, whereby users talk to the system more as they would to an adult human interlocutor. Alternatively, there is work showing that listeners rate ‘robotic’ text-to-speech (TTS) voices as less communicatively competent than more human-like voices (Cowan et al., 2015) and that listeners perceive prosodic peculiarities in the Alexa voice, describing it as ‘monotonous’ and ‘robotic’ (Siegert and Krüger, 2020). Accordingly, an alternative prediction is that speakers will use a slower speaking rate when talking to the Alexa socialbot, since robotic voices are perceived as less communicatively competent.…”
Section: Introduction (mentioning)
confidence: 99%