ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp43922.2022.9747141

Location-Based Training for Multi-Channel Talker-Independent Speaker Separation

Abstract: The performance of automatic speech recognition (ASR) systems severely degrades when multi-talker speech overlap occurs. In meeting environments, speech separation is typically performed to improve the robustness of ASR systems. Recently, location-based training (LBT) was proposed as a new training criterion for multi-channel talker-independent speaker separation. Assuming fixed array geometry, LBT outperforms widely-used permutation-invariant training in fully overlapped utterances and matched reverberant con…

Cited by 7 publications (2 citation statements)
References: 32 publications
“…Recently, there has been a lot of exploration in the field of multi-party meetings scenarios [1,2,3,4,5]. Progress has also been advanced with several challenges [6,7,8,9,10,11] and datasets [12,13,14,15,16] specifically focusing on this field.…”
Section: Introduction
mentioning
confidence: 99%
“…The prior studies have laid the foundation for recent progress. With the advent of deep learning, speech separation has seen major progress [18]- [28], even with different speaker overlapping ratios [29]- [37]. In a neural architecture, multiple speaker streams compete and segregate either with a masking or regression mechanism.…”
mentioning
confidence: 99%