The Speaker and Language Recognition Workshop (Odyssey 2018) 2018
DOI: 10.21437/odyssey.2018-17

Incremental On-Line Clustering of Speakers' Short Segments

Abstract: This paper deals with clustering of speakers' short segments, in a scenario where additional segments continue to arrive and should be constantly clustered together with previous segments that were already clustered. In realistic applications, it is not possible to cluster all segments every time a new segment arrives. Hence, incremental clustering is applied in an on-line mode. New segments can either belong to existing speakers, therefore, have to be assigned to one of the existing clusters, or they could be…

Cited by 3 publications (3 citation statements)
References 15 publications

“…Another example is online speaker diarization or clustering [4][5][6][7][8][9]. In this case, short speech segments from an audio stream have to be classified with low latency (e.g.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…in a multispeaker audio stream [1]. Some of the practical applications of diarization technology include information retrieval [2], broadcast news, meeting conversations, telephone calls, VoIP, digital audio logging [3] and interaction analysis in Peer-Led Team Learning (PLTL) groups [4,5,6,7]. Diarization is a challenging task for naturalistic audio streams as they contain short conversational turns, overlapped speech, noise and reverberation [8,9].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…NIST Rich Transcription (RT) evaluations involved broadcast news data and meeting recordings for diarization study [1]. Summed-channel telephone speech from NIST SRE evaluations have been used in diarization studies [4].…”
Section: Introduction
Citation type: mentioning
confidence: 99%