6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020) 2020
DOI: 10.21437/chime.2020-1

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

Cited by 149 publications
(110 citation statements)
“…It also seems interesting to be able to extend this type of study to new techniques proposed in the field of voice recognition, in which ideas based on the study of samples obtained in real noisy environments such as social gatherings, streets, cafes and restaurants are raised [108]. Likewise, an interesting line to take into account in this regard is given by the current challenges posed to address a voice recognition scenario capable of providing speech enhancement, speaker diarization and speech recognition modules, for example, by means of recognition modules based on multispeaker speech recognition for unsegmented recordings [109].…”
Section: Discussion (mentioning)
confidence: 99%
“…In spontaneous human conversations different speakers tend to overlap with each other and, in meeting scenarios with more than two participants, the amount of overlapped speech can account for a significant portion of the total speech time, usually between 10% and 20% (McCowan et al, 2005; Watanabe et al, 2020). This phenomenon is one of the main obstacles towards fully reliable multi-party speech diarization (Ryant et al, 2018; García-Perera et al, 2020) and recognition (Watanabe et al, 2017; Vincent et al, 2018; Haeb-Umbach et al, 2019).…”
Section: Motivation (mentioning)
confidence: 99%
“…For this reason, Overlapped Speech Detection (OSD) is crucial to prevent back-end task performance degradation. This can be accomplished by including a reliable OSD algorithm together with Voice Activity Detection (VAD) in the very front-end part of the pipeline, possibly followed by speech separation (García-Perera et al, 2020; Watanabe et al, 2020). Speaker counting (Stöter et al, 2019) is a closely related task, which can be seen as an extension of VAD+OSD.…”
Section: Motivation (mentioning)
confidence: 99%
“…Speaker diarization has attracted attention because it can be used to boost the performance of ASR [23]. Motivated by the CHiME Challenges [24,25] and the DIHARD Challenges [26,27], several researchers have worked on developing more advanced speaker diarization system. Lin et al proposed a long short-term memory (LSTM)-based similarity measurement for the clustering-based speaker diarization.…”
Section: Related Work (mentioning)
confidence: 99%