2008
DOI: 10.1109/icassp.2008.4518619

Overlapped speech detection for improved speaker diarization in multiparty meetings

Cited by 99 publications (85 citation statements)
References 2 publications
“…In [100] a real overlap detection system was developed, as well as a better heuristic that computed posterior probabilities from diarization to post process the output and include a second speaker on overlap regions. The main bottleneck of the achieved performance gain is mainly due to errors in overlap detection, and more work on enhancing its precision and recall is reported in [99], [101]. The main approach consists of a three state HMM-GMM system (non-speech, non-overlapped speech, and overlapped speech), and the best feature combination is MFCC and modulation spectrogram features [103], although comparable results were achieved with other features such as root mean squared energy, spectral flatness, or harmonic energy ratio.…”
Section: Overlap Detection (mentioning)
confidence: 99%
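
The three-state decoding described in the statement above can be illustrated with a minimal sketch. It is not the implementation of any cited system: it assumes a single hypothetical scalar feature per frame, one Gaussian per state in place of a GMM over MFCC and modulation spectrogram features, and hand-picked means, variances, and a self-transition probability (`stay_prob`) chosen purely for illustration.

```python
import math

# The three states named in the citation statement.
STATES = ("nonspeech", "speech", "overlap")


def gaussian_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian (stand-in for a per-state GMM)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(frames, means, variances, stay_prob=0.9):
    """Most likely state sequence for a sequence of scalar frame features.

    Assumptions (illustrative only): uniform initial state probabilities,
    and a transition matrix that keeps the current state with probability
    stay_prob and splits the remainder evenly over the other states.
    """
    if not frames:
        return []
    n = len(STATES)
    switch = (1.0 - stay_prob) / (n - 1)
    log_trans = [
        [math.log(stay_prob if i == j else switch) for j in range(n)]
        for i in range(n)
    ]
    # Initialise with a uniform prior plus the first frame's emission score.
    score = [
        math.log(1.0 / n) + gaussian_logpdf(frames[0], means[s], variances[s])
        for s in range(n)
    ]
    backptrs = []
    for x in frames[1:]:
        new_score, ptr = [], []
        for j in range(n):
            # Best predecessor state for landing in state j at this frame.
            best_i = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            ptr.append(best_i)
            new_score.append(
                score[best_i]
                + log_trans[best_i][j]
                + gaussian_logpdf(x, means[j], variances[j])
            )
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


# Toy input: a made-up feature that rises as more speakers are active.
frames = [0.1, 0.0, 0.2, 1.0, 1.1, 0.9, 2.1, 1.9, 2.0]
labels = viterbi(frames, means=[0.0, 1.0, 2.0], variances=[0.05, 0.05, 0.05])
```

Frames labelled `overlap` are the regions where, per the statement, a post-processing step could assign a second speaker. The self-transition probability acts as a smoothing control: higher values discourage short, spurious state changes.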
“…Doing so adversely affects the purity of speaker models, which ultimately reduces diarization performance. Approaches to overlap detection were thoroughly assessed in [97], [98] and, even whilst applied to ASR as opposed to speaker diarization, only a small number of systems actually detects overlapping speech well enough to improve error rates [99]- [101].…”
Section: Overlap Detection (mentioning)
confidence: 99%
“…Although we feel that our approach is promising, we clearly need to perform more research to improve our overlap detection system. Note that a number of other research institutes currently also investigate the overlapping speech problem [12], [16].…”
Section: A Top-down Analysis (mentioning)
confidence: 99%
“…This is due to the high degree of overlapping speech in this dataset (13.6% for RT'09 cf. 7.6% for RT'07) which is well known to have a significant impact on the performance of state-of-the-art speaker diarization systems [41]. Speaker diarization performance using a top-down approach is illustrated on row 5 of Table II.…”
Section: Diarization Performance (mentioning)
confidence: 99%