Interspeech 2022
DOI: 10.21437/interspeech.2022-10363

Online Speaker Diarization with Core Samples Selection

Cited by 7 publications
(3 citation statements)
References 0 publications
“…For evaluating online diarization, we used FW-STB with EEND-EDA based on four-stacked Transformers [26]. In addition, we referred to the results of various conventional online diarization methods [24], [26], [50]-[52], [70], [71] on various datasets. Some cascaded comparison methods [50], [51], [70] used the oracle SAD; for a fair comparison, we used SAD post-processing. The values are from the original FW-STB paper [26].…”
Section: Experimental Settings (mentioning)
confidence: 99%
“…Speaker diarisation (SD), which segments input audio to short utterances according to speaker identity, is going through a rapid breakthrough [1,2]. Based on the success of recent SD systems [3-12], online SD systems are also being developed [13-20]. In an online SD system, the system should decide the speaker label of a given short segment leveraging only current and past segments, where only a part of past segments are available.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [15], authors modified the agglomerative hierarchical clustering (AHC) algorithm, widely adopted in offline SD systems, and proposed a checkpoint AHC with the label matching algorithm. Authors of [16] adopted a memory module for each speaker and contained selected embeddings, where VBx [9] and cosine operations on centroids were used for clustering. Wang et al. [17] adapted target speaker voice activity detection (TS-VAD), a successful offline SD framework, to online SD scenarios [8,22].…”
Section: Introduction (mentioning)
confidence: 99%