2024
DOI: 10.4218/etrij.2023-0358

Joint streaming model for backchannel prediction and automatic speech recognition

Yong‐Seok Choi,
Jeong‐Uk Bang,
Seung Hi Kim

Abstract: In human conversations, listeners often utilize brief backchannels such as “uh‐huh” or “yeah.” Timely backchannels are crucial to understanding and increasing trust among conversational partners. In human–machine conversation systems, users can engage in natural conversations when a conversational agent generates backchannels like a human listener. We propose a method that simultaneously predicts backchannels and recognizes speech in real time. We use a streaming transformer and adopt multitask learning for co…

Cited by 1 publication (1 citation statement)
References 24 publications
“…The ninth paper [9], "Joint streaming model for backchannel prediction and automatic speech recognition" by Choi and others, addresses a crucial aspect of human conversation: the timely use of conversation backchannels such as "uh-huh" or "yeah." This paper introduces a novel method that combines backchannel prediction with real-time speech recognition using a streaming transformer and multitask learning.…”
mentioning
confidence: 99%