2019
DOI: 10.48550/arxiv.1905.12230
Preprint

Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR

Cited by 7 publications (3 citation statements)
References 27 publications
“…Multi-array GSS [13,30] was applied to enhance target speaker speech signals. For track 1, we used oracle speech segmentations and speaker labels, while for track 2, we used the segmentation estimated by the speaker diarization module described in Section 3.…”
Section: Guided Source Separation (GSS)
Citation type: mentioning
confidence: 99%
“…Due to its importance in the front end of speech signal processing, speech separation has been an important research direction in academic and industry fields. It has derived a series of cutting-edge applications in ASR (Automatic Speech Recognition) [28,29], SED (Sound Event Detection) [30,31,32] and other areas, such as call customer service channels [33], multi-speaker meeting minutes [34] and target instruction extraction of smart speakers in domestic scene [35]. Speech enhancement and speech separation methods have undergone a long development.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Recently, the multi-channel speech separation achieves good performance [13,14] and has been successfully integrated into conversation transcription systems [15]. However, the improvement has still been limited with single channel input for the conversational tasks [16,17,18].…”
Section: Introduction
Citation type: mentioning
confidence: 99%