Interspeech 2021
DOI: 10.21437/interspeech.2021-2041

Look Who’s Talking: Active Speaker Detection in the Wild

Cited by 8 publications
(2 citation statements)
References 0 publications
“…Several AVSR architectures have been proposed [4,10,17,13,16,22,23] which show that the improvement over ASR models is greater as the noise level increases, i.e., the SNR is lower. The same VSR architectures can also be used to improve the performance of audio-based models in a variety of applications like speech enhancement [24], speech separation [25,26], voice activity detection [27], active speaker detection [28] and speaker diarisation [29].…”
Section: Applications (mentioning)
confidence: 99%
“…The goal of noise-tolerant speaker diarization is to achieve improved performance in noisy environments. A recent work [19] tackles this problem using the auto-encoder architecture as a dimensionality reduction module. They extract two low-dimensional codes from speaker embeddings, representing the speaker identity and irrelevant noise information, then remove the noise factors.…”
Section: Introduction and Related Work (mentioning)
confidence: 99%