2018
DOI: 10.1007/978-3-319-99579-3_13

A Free Synthetic Corpus for Speaker Diarization Research

citations
Cited by 5 publications
(4 citation statements)
references
References 40 publications
“…Samples of the classification were verified by listening to the recording with the patient’s voice panned to the left and the therapist’s voice panned to the right headphone loudspeaker. To assess the quality of the performed silence detection and diarization, we performed a test on a free synthetic speech corpus (Edwards et al, 2018), which showed error rates for our method around 5% on average. Comparison with a manually coded 30-min extract from one of our therapies revealed an interrater reliability (Cohen’s κ) of .80 for identification of silence and correct diarization.…”
Section: Methods (mentioning)
confidence: 99%
“…Because the method requires only a specially-constructed dataset, it can be used equally to evaluate other diarization components [29] or end-to-end systems that are otherwise resistant to introspection. Future work might also further explore the relationship of conversation characteristics with accuracy by manipulating specific conversation characteristics in synthetic structures, such as the rate of turn changes, amount of speaker overlap, and number of speakers [3,4,7].…”
Section: Discussion (mentioning)
confidence: 99%
“…The two versions will not occur in a natural speech corpus. Instead of using natural conversations, both versions are constructed by splicing source audio from the two speakers according to the desired structure [7]. If there are more than two speakers and roles, then all factorial combinations of speakers and roles can be generated.…”
Section: Version 1: a A A B B A A; Version 2: A A A B B A A (mentioning)
confidence: 99%
“…The supervised diarization method is tested on the EMRAI Synthetic Diarization Corpus (Edwards et al, 2018). This corpus is based on the LibriSpeech Corpus (Panayotov et al, 2015), namely recordings of English audiobooks.…”
Section: EMRAI Synthetic Diarization Corpus (mentioning)
confidence: 99%