Impact of overlapping speech detection on speaker diarization for broadcast news and debates

Charlet, Delphine; Barras, Claude; Liénard, Jean‐Sylvain

doi:10.1109/icassp.2013.6639163

Cited by 27 publications

(19 citation statements)

References 20 publications

(18 reference statements)


“…https://github.com/jsalt2019-diadet/jsalt2019-diadet3 Thanks to Claude Barras for providing the overlapped speech detection output corresponding to system L 1 in Table 2 of [20], and Marie Kunešová for providing the overlapped speech detection output corresponding to system "AMI test (all subsets) + dereverberation" in Table 2 of [8].…”

mentioning

confidence: 99%

Overlap-Aware Diarization: Resegmentation Using Neural End-to-End Overlapped Speech Detection

Bullock

Bredin

García-Perera

2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


We address the problem of effectively handling overlapping speech in a diarization system. First, we detail a neural Long Short-Term Memory-based architecture for overlap detection. Second, detected overlap regions are exploited in conjunction with a frame-level speaker posterior matrix to make two-speaker assignments for overlapped frames in the resegmentation step. The overlap detection module achieves state-of-the-art performance on the AMI, DIHARD, and ETAPE corpora. We apply overlap-aware resegmentation on AMI, resulting in a 20% relative DER reduction over the baseline system. While this approach is by no means an end-all solution to overlap-aware diarization, it reveals promising directions for handling overlap.
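As a rough illustration of the two-speaker assignment described in this abstract, the sketch below keeps the two highest-posterior speakers for frames flagged as overlap and only the single best speaker elsewhere. The function name `assign_speakers`, the toy posterior values, and the simple top-k rule are assumptions made for illustration; the resegmentation procedure actually used in the paper may differ.

```python
def assign_speakers(posteriors, overlap_mask):
    """Pick active speakers per frame (hypothetical helper, not from the paper).

    posteriors   : list of frames, each a list of per-speaker posterior scores
    overlap_mask : list of booleans, True where overlap was detected
    Returns a list of sorted speaker-index lists, one per frame.
    """
    labels = []
    for frame_post, is_overlap in zip(posteriors, overlap_mask):
        # Rank speaker indices from most to least likely for this frame.
        ranked = sorted(range(len(frame_post)),
                        key=lambda k: frame_post[k],
                        reverse=True)
        # Two speakers in detected overlap, one speaker otherwise.
        n_active = 2 if is_overlap else 1
        labels.append(sorted(ranked[:n_active]))
    return labels


# Toy example: three frames, three candidate speakers (made-up values).
posteriors = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.3, 0.6],
]
overlap = [False, True, False]
labels = assign_speakers(posteriors, overlap)
print(labels)
```

Only the middle frame is marked as overlap here, so it is the only frame that receives two speaker labels.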


“…pyannote.audio provides a set of command line tools for training, validation, and application of modules listed in 1 x-vector aficionados would still suggest to use PLDA anyway... Table 3. Evaluation of pre-trained overlapped speech detection models, in terms of precision (%) and recall (%).…”

Section: Reproducible Results

mentioning

confidence: 99%

“…Among all pyannote.audio alternatives, it is the most similar: written in Python, it provides most of the aforementioned blocks, and goes all the way down to the actual evaluation of the system. [Footnote: This research was partly funded by the French National Research Agency (ANR) through the ODESSA (ANR-15-CE39-0010) and PLUM-COT (ANR-16-CE92-0025) projects. We would like to thank Claude Barras for providing the overlapped speech detection output corresponding to system L 1 in Table 2 of [1], Neville Ryant for the speaker diarization output of the winning submission to DIHARD 2019 [2,3], Marie Kunešová for the overlapped speech detection output corresponding to system "AMI test (all subsets) + dereverberation" in Table 2 of [4], and Sylvain Meignier for the speaker diarization output of [5] on ETAPE dataset.]…”

Section: Introduction

mentioning

confidence: 99%

Pyannote.Audio: Neural Building Blocks for Speaker Diarization

Bredin

Yin

Coria

et al. 2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding, reaching state-of-the-art performance for most of them.


“…Speaker diarization can be useful for speaker verification with nonoverlapping multi-talker speech [1][2][3][4][5][6]. It can effectively exclude unwanted speech segments when the speakers only slightly overlap [7,8]. However, such system fails when multitalkers speak simultaneously most of the time.…”

Section: Introduction

mentioning

confidence: 99%

Target Speaker Extraction for Multi-Talker Speaker Verification

Rao

Chng

et al. 2019

Interspeech 2019


The performance of speaker verification degrades significantly when the test speech is corrupted by interference from nontarget speakers. Speaker diarization separates speakers well only if the speakers are not overlapped. However, if multiple talkers speak at the same time, we need a technique to separate the speech in the spectral domain. In this paper, we study a way to extract the target speaker's speech from an overlapped multi-talker speech. Specifically, given some reference speech samples from the target speaker, the target speaker's speech is firstly extracted from the overlapped multi-talker speech, then the extracted speech is processed in the speaker verification system. Experimental results show that the proposed approach significantly improves the performance of overlapped multi-talker speaker verification and achieves 64.4% relative EER reduction over the zero-effort baseline.
