2006
DOI: 10.1016/j.csl.2005.08.002

Step-by-step and integrated approaches in broadcast news speaker diarization

Abstract: This paper summarizes the collaboration of the LIA and CLIPS laboratories on speaker diarization of broadcast news during the spring NIST Rich Transcription 2003 evaluation campaign (NIST-RT'03S). The speaker diarization task consists of segmenting a conversation into homogeneous segments which are then grouped into speaker classes. Two approaches are described and compared for speaker diarization. The first one relies on a classical two-step speaker diarization strategy based on a detect…

Cited by 114 publications (110 citation statements)
References 22 publications
“…Previous work would seem to support this observation [24]. We report our recent work on system combination in Section IV-E.…”
Section: ) Discrimination and Purification | Citation type: supporting
confidence: 51%
“…A number of combination approaches have been proposed previously, at the clustering stage [24], [31] or at the output stage [32]-[34]. Better performance is usually obtained but, with the exception of [35], none of the previous work considers the combination of both bottom-up and top-down system outputs without further re-segmentation.…”
Section: E Combination | Citation type: mentioning
confidence: 99%
“…Those that apply segmentation to the MFCC stream, which might be uniform or based on the speaker change detection algorithms (see (Chen & Gopalakrishnan, 1998)), and those that do not apply such segmentation. Following the terminology of (Meignier et al., 2006) we will refer to the former branch as step-by-step algorithms, while to the latter as integrated algorithms. Both algorithmic approaches exploit a certain characteristic that the speaker labels exhibit, which is the temporal continuity.…”
Section: General Algorithmic Approaches | Citation type: mentioning
confidence: 99%
“…For FixSlidHAC pR, we first applied FixSlid with the threshold parameter pRange to segment the input audio stream, then we pruned non-speech regions within the audio segments and grouped the segments using HAC with multiple stages, which have been applied in state-of-the-art speaker diarization systems [8], [7], [35], [30]. As shown in Fig.…”
Section: B Experiments On Broadcast News Data 1) Data Set Description | Citation type: mentioning
confidence: 99%