2023
DOI: 10.3389/fcomm.2023.1116955

Siri, you've changed! Acoustic properties and racialized judgments of voice assistants

Abstract: As speech technology is increasingly integrated into modern American society, voice assistants are a more significant part of our everyday lives. According to Apple, Siri fulfills 25 billion requests each month. As part of a software update in April 2021, users in the U.S. were presented with a choice of 4 Siris. While in beta testing, users on Twitter began to comment that they felt that some of the voices had racial identities, noting in particular that Voice 2 and Voice 3 “sounded black.” This study tests w…

Cited by 4 publications (5 citation statements)
References 31 publications
“…Thus, even though there is less alignment toward device interlocutors, suggesting that device interlocutors are viewed as socially distinct from humans, people still apply gender stereotypes to technological agents based on the properties of the voice alone. More recent work finds similar biases in evaluation of robots, smart speakers, and voice assistants based on social-indexical properties of the voices (Ernst and Herm-Stapelberg, 2020; Holliday, 2023; and see Sutton et al., 2019 for discussion of biases and speech-based attitudes and discrimination as relevant for voice-AI design). The question of how such biases play out in vocal alignment behavior toward voice-AI is an open question for future work.…”
Section: Vocal Alignment Toward Speech Technology
confidence: 90%
“…This has been shown to apply to voice-AI as well: users perceive male voice assistants as more competent than female voice assistants (Ernst and Herm-Stapelberg, 2020). Since voice-based stereotyping also occurs based on the racial and age-based cues present in talkers' speech (e.g., Kurinec and Weaver, 2021 for race; e.g., Hummert et al., 2004 for age), we predict that similar biases in judgments of communicative competence vary based on apparent ethnicity and age of device voices [see discussion of Holliday (2023) and related work in section 3]. Whether these factors influence patterns and extent of pronunciation adjustments present in device-DS is a ripe question for future work.…”
Section: User Speech Variation in Production During Human-Computer In...
confidence: 94%
“…In particular, they highlight that using language is one such "cue". In support of this view, speakers have been shown to vocally align their speech when talking to voice-AI interlocutors similarly to human interlocutors (Cohn, Predeck, et al., 2021; Zellou, Cohn, & Ferenc Segedin, 2021), and a growing body of work has shown that people perceive social attributes of voice-AI, including gender, age, race/ethnicity, and emotion (Cohn et al., 2019; Ernst & Herm-Stapelberg, 2020; Gessinger et al., 2022; Holliday, 2023; Zellou, Cohn, & Ferenc Segedin, 2021). In the present study, finding similar prosodic focus marking would suggest that the acoustic realization of information structure is part of this application of human-human social rules to voice-AI, suggesting that equivalence supersedes adaptations for a less-than-rational listener.…”
Section: Rational Listener Hypothesis
confidence: 97%