2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.2004.1326000

The ELISA consortium approaches in broadcast news speaker segmentation during the NIST 2003 rich transcription evaluation

Abstract: This paper presents the ELISA consortium's activities in automatic speaker segmentation, also known as speaker diarization, during the NIST Rich Transcription (RT) 2003 evaluation. The experiments were conducted on real broadcast news data (HUB4). Two different approaches, from the CLIPS and LIA laboratories, are presented, and different ways of combining them are investigated within the framework of the ELISA consortium. The system submitted as the ELISA primary system obtained the second-lowest segmentation error rate com…

Cited by 34 publications (34 citation statements)
References 6 publications

“…A more integrated merging method is described in [49], while [35] describes a way of using the 2002 NIST speaker segmentation error metric to find regions in two inputs which agree and then uses these to train potentially more accurate speaker models. These systems generally produce performance gains, but tend to place some restriction on the systems being combined, such as the required architecture or equalizing the number of speakers.…”
Section: Combining Different Diarization Methods
Citation type: mentioning
confidence: 99%
“…Several methods of combining aspects of different diarization systems have been tried, for example the "hybridization" or "piped" CLIPS/LIA systems of [35] and [49] and the "plug and play" CUED/MIT-LL system of [20] which both combine components of different systems together. A more integrated merging method is described in [49], while [35] describes a way of using the 2002 NIST speaker segmentation error metric to find regions in two inputs which agree and then uses these to train potentially more accurate speaker models.…”
Section: Combining Different Diarization Methods
Citation type: mentioning
confidence: 99%
“…As far as the ELISA piped system [7] is concerned, the two systems seem to be complementary. In theory, we could possibly pipe our segmentation using the Gaussian features to the HMM-based LIA system [7] and get clusters with lower DER. We could apply the same process to the non-Gaussianized system and get clusters with…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Several methods of combining different diarization systems exist. One example is the piped system [7] [8] where the segmentation from the CLIPS system is piped to the LIA system for better initialization. Another example is the cluster voting scheme [9] that combines the clusters from two speaker diarization systems.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This could be done by either assigning each utterance to multiple related clusters [30], or pre-segmenting utterances into small speaker-homogeneous regions and then clustering those regions. In parallel, speaker segmentation may be improved with the aid of speaker clustering [31]. Specifically, speech segments assigned to each cluster can be used to train a speaker-related model, thereby examining the speaker change boundaries of an audio recording in a manner of frame-by-frame recognition.…”
Section: Discussion
Citation type: mentioning
confidence: 99%