2021
DOI: 10.3389/fcomm.2020.600361

Age- and Gender-Related Differences in Speech Alignment Toward Humans and Voice-AI

Abstract: Speech alignment is where talkers subconsciously adopt the speech and language patterns of their interlocutor. Nowadays, people of all ages are speaking with voice-activated, artificially-intelligent (voice-AI) digital assistants through phones or smart speakers. This study examines participants’ age (older adults, 53–81 years old vs. younger adults, 18–39 years old) and gender (female and male) on degree of speech alignment during shadowing of (female and male) human and voice-AI (Apple’s Siri) productions. D…

Citations: cited by 29 publications (22 citation statements)
References: 51 publications
“…Additionally, the present study used two types of voices; it is possible that other paralinguistic features of those voices might have mediated speech style adjustments. For example, recent work has shown that speakers align speech differently toward TTS voices that "sound" older (e.g., Apple's Siri voices, rated in their 40s and 50s) (Zellou et al., 2021). Furthermore, there is work showing that introducing "charismatic" features from human speakers' voices shapes perception of TTS voices (Fischer et al., 2019; Niebuhr and Michalsky, 2019).…”
Section: Discussion (mentioning)
confidence: 99%
“…First, the interlocutor introduced themselves and then went through voice-over instructions with the participant. Participants saw an image corresponding to the interlocutor category: stock images of "adult female" (used in prior work; Zellou et al., 2021) and "Amazon Alexa" (2nd Generation Black Echo).…”
Section: Methods (mentioning)
confidence: 99%
“…Conversely, a recent study of the same phenomenon in spontaneous speech found that early Cantonese-English bilinguals were less likely to release final stops in English than non-Cantonese-English bilinguals [20]. These conflicting outcomes simply illustrate the need to examine variation in speech across styles and registers, as this variation has maximum utility for ASR systems and the development of NLP tools for speech and language, given how little is known about how talkers interact with such systems [21].…”
Section: Introduction (mentioning)
confidence: 99%