2006 IEEE Odyssey - The Speaker and Language Recognition Workshop 2006
DOI: 10.1109/odyssey.2006.248114

Speaker Diarization: About whom the Speaker is Talking ?

Abstract: Automatic speaker diarization consists of splitting the signal into homogeneous segments and clustering them by speaker. However, the resulting speaker segments carry only anonymous labels. This paper proposes a solution to identify those speakers by extracting their full names as pronounced in the show. With a semantic classification tree automatically built on a training corpus, the full names detected in the transcription of a segment are associated with this segment or with one of its neighbors. Then, a merging met…

Cited by 19 publications (26 citation statements)
References 6 publications
“…Our system to identify speakers by name is based on the use of SCT and it is presented in [5] and [7]. A SCT is used for each occurrence of full name detected in the transcripts.…”
Section: Baseline System (mentioning)
confidence: 99%
“…In all the previous papers dealing with named speaker identification, the results were presented in terms of duration [4,5,6,7,8]. That is to say that if a system is able to correctly name a speaker who speaks 90% of the time and miss the other six speakers who speak 10% of the time, it will have very good results (90% of recall and 90% of precision).…”
Section: Scoring (mentioning)
confidence: 99%
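
The arithmetic behind this scoring remark can be sketched as follows. This is a minimal illustration, not the evaluation code of any cited paper: the speaker names, the durations, and the helper functions `duration_recall` and `speaker_recall` are all hypothetical, chosen only to reproduce the case of one speaker holding 90% of the speech time and six speakers sharing the remaining 10%.

```python
# Hypothetical show (invented numbers): one dominant speaker holds 90% of
# the speech time; six minor speakers share the remaining 10%.
durations = {"anchor": 5400}                                  # seconds
durations.update({f"guest_{i}": 100 for i in range(1, 7)})    # 6 x 100 s

# Assumption: the system names only the dominant speaker correctly.
correctly_named = {"anchor"}


def duration_recall(durations, correct):
    """Share of total speech time belonging to correctly named speakers."""
    total = sum(durations.values())
    named = sum(d for spk, d in durations.items() if spk in correct)
    return named / total


def speaker_recall(durations, correct):
    """Share of speakers correctly named, ignoring how long each one talks."""
    named = sum(1 for spk in durations if spk in correct)
    return named / len(durations)


print(f"duration-based recall: {duration_recall(durations, correctly_named):.2f}")
print(f"speaker-based recall:  {speaker_recall(durations, correctly_named):.2f}")
```

Under these assumed numbers the duration-based recall is 0.90, while counting speakers gives only 1 of 7, which is the gap the quoted statement points at. Precision is left out of the sketch because it depends on what names, if any, the system assigns to the other six speakers.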
“…Note that a parallel approach for lexical SID in TV shows is to use lexical context around spoken names to classify the names between speaker, addressee and object [8,29,24]. On the contrary, this work does not depend on spoken names (hence neither on a Named Entity Recognizer), but rather analyzes the general lexical content of speech.…”
Section: Introduction (mentioning)
confidence: 99%