Interspeech 2020
DOI: 10.21437/interspeech.2020-1335

Social and Functional Pressures in Vocal Alignment: Differences for Human and Voice-AI Interlocutors

Cited by 7 publications (12 citation statements) | References 13 publications

“…Additionally, we did not observe differences in how participants adapted their speech following an emotionally expressive or neutral word misrecognition. This contrasts with related work on this same corpus (Zellou and Cohn, 2020) that found greater vowel duration alignment when participants responded to an emotionally expressive word misunderstanding made by a voice-AI system. Thus, it is possible that emotional expressiveness might shape vocal alignment, but it might not influence speech style adjustments.…”
Section: Discussion (contrasting)
confidence: 99%
“…Alternatively, the presence of emotionality might lead to distinct clear speech strategies for the human and voice-AI interlocutors. For example, a study of phonetic alignment (using the same corpus as the current study) found that vowel duration alignment differed both by the social category of interlocutor (human vs. voice-AI) and based on emotionality (Zellou and Cohn, 2020): participants aligned more in response to a misrecognition, consistent with H&H theory (Lindblom, 1990), and alignment increased even more when the voice-AI talker was emotionally expressive when conveying their misunderstanding (e.g., "Bummer! I'm not sure I understood.…”
Section: Different Strategies To Improve Intelligibility Following a Misrecognition? (supporting)
confidence: 59%
“…Socially-mediated imitation patterns are often interpreted through the lens of Communication Accommodation Theory (CAT) (Giles et al., 1991; Shepard, 2001), which proposes that speakers use linguistic alignment to emphasize or minimize social differences between themselves and their interlocutors. The CAT framework can also be applied to understand human-device interaction: recent studies that make a direct comparison between human and voice-AI interlocutors found greater vocal imitation for the human, relative to the voice-AI speaker (e.g., Apple's Siri in Cohn et al., 2019; Snyder et al., 2019; Amazon's Alexa in Raveh et al., 2019; Zellou and Cohn, 2020). Less speech alignment toward digital device assistants suggests that people may be less inclined to demonstrate social closeness toward voice-AI than toward humans.…”
Section: Introduction (mentioning)
confidence: 99%