2022 · DOI: 10.1016/j.wocn.2021.101123

Acoustic-phonetic properties of Siri- and human-directed speech

Cited by 31 publications (40 citation statements)
References 72 publications

“…Speakers produce 'clear speech' when there is reason to believe their listener will have trouble comprehending the signal. Clear speech is characterized by a variety of acoustic modifications relative to casual or conversational speech, such as a slower speaking rate and more extreme segmental articulations (Picheny et al., 1986; Krause & Braida, 2002; Uchanski, 2005; Smiljanić & Bradlow, 2009; Dilley et al., 2014; Cohn & Zellou, 2021; Cohn et al., 2022). Speaking clearly has repeatedly been shown to benefit listeners by increasing intelligibility (e.g.…”
Section: A Clear Speech (mentioning)
confidence: 99%
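The acoustic modifications listed in this statement are measured instrumentally from recordings. As a minimal sketch of how two of them (speaking rate and mean f0) might be extracted, the snippet below uses the parselmouth interface to Praat; the file name and syllable count are hypothetical placeholders, and this is an illustration rather than the measurement pipeline used in the paper.

```python
# Minimal sketch: two acoustic measures often reported in clear-speech studies.
# The recording name and syllable count are hypothetical placeholders.
import parselmouth  # Python interface to the Praat phonetics toolkit

snd = parselmouth.Sound("utterance.wav")    # hypothetical recording
pitch = snd.to_pitch()                      # default Praat pitch tracking
f0 = pitch.selected_array["frequency"]      # f0 in Hz; 0 where unvoiced
mean_f0 = f0[f0 > 0].mean()                 # mean fundamental frequency (Hz)

n_syllables = 12                            # assumed known from a transcript
speaking_rate = n_syllables / snd.duration  # syllables per second

print(f"mean f0: {mean_f0:.1f} Hz; rate: {speaking_rate:.2f} syll/s")
```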
“…We therefore vary whether the guise of the talker is congruent (shown an image of a device) or incongruent (shown an image of a human). If alignment is driven by functional reasons, we expect participants to align most toward device-guise voices in an effort to communicate more effectively (Cowan et al., 2015; Cohn et al., 2022). Conversely, if alignment is driven by similarity attraction (Byrne, 1971), we might expect participants to align more toward human-guise voices (Gessinger et al., 2021).…”
Section: Current Study (mentioning)
confidence: 97%
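Alignment in studies like these is commonly quantified with a difference-in-distance (DID) measure: how much closer a participant's production moves to the model talker after exposure. The sketch below illustrates that computation with made-up f0 values; the function and the numbers are assumptions for illustration, not values or code from the studies cited.

```python
# Difference-in-distance (DID), a common operationalization of phonetic
# alignment: positive DID means the participant converged toward the model.
def did(baseline: float, post_exposure: float, model: float) -> float:
    return abs(baseline - model) - abs(post_exposure - model)

# Illustrative (made-up) mean f0 values in Hz:
print(did(baseline=210.0, post_exposure=218.0, model=230.0))  # 8.0 -> convergence
```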
“…They also found, however, that voice type overall (human or device) was not a significant predictor of alignment patterns, suggesting that the acoustic differences between human and TTS voices were not the main driver of differences in alignment. Other work has shown that people have distinct expectations about the communicative competence of technology: for example, participants explicitly rate a TTS voice as less competent and less human-like than a human voice (Cohn et al., 2022), and rate more robotic TTS voices as less competent than more human-like TTS voices (Cowan et al., 2015; Zellou et al., 2021a). Additionally, given the identical guise for a talker (cued by an image of a human or device silhouette), listeners show worse performance on a speech-in-noise task (Aoki et al., 2022).…”
Section: Introduction (mentioning)
confidence: 99%
“…Interestingly, this prediction has also been extended to voice-AI (Uther et al., 2007): the basic idea is that speakers treat such voice-AI devices as if they require enhanced speech input. Recent findings have corroborated this prediction (Burnham et al., 2010; Cohn et al., 2022). For example, one study examined the adjustments that speakers made in response to a misrecognition by a human or by voice-AI (Amazon's Alexa).…”
Section: Introduction (mentioning)
confidence: 93%