2022
DOI: 10.48550/arxiv.2203.16843
Preprint

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction

Abstract: A speaker extraction algorithm extracts the target speech from a speech mixture containing interfering speech and background noise. The extraction process sometimes over-suppresses the extracted target speech, which not only creates artifacts during listening but also harms the performance of downstream automatic speech recognition algorithms. We propose a hybrid continuity loss function for time-domain speaker extraction algorithms to address the over-suppression problem. On top of the waveform-level loss used …

Citations: cited by 1 publication (1 citation statement)
References: 32 publications (51 reference statements)
“…We use the standard form of SI-SDR where the target signal is scaled to match the scale of the estimated signal. Also, we scale the magnitude loss by the L 1 norm of the magnitude of the target signal in the STFT domain similar to [24]. These loss functions are defined below…”
Section: G Loss Functions (mentioning)
confidence: 99%
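
The quoted statement breaks off before the citing paper's own definitions, so the following is only a minimal sketch of the two losses it names, not that paper's code. It assumes the common SI-SDR definition, in which the target is rescaled by its projection onto the estimate, and a magnitude loss equal to the L1 distance between magnitude spectra divided by the L1 norm of the target magnitude. The names `si_sdr`, `stft_magnitudes`, `normalized_magnitude_loss`, and the parameters `frame_len`, `hop`, and `eps` are hypothetical, and the naive rectangular-window DFT is a stand-in for a real STFT.

```python
import cmath
import math


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB; the target is rescaled, not the estimate."""
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / (target_energy + eps)  # optimal scale applied to the target
    scaled = [alpha * t for t in target]
    noise = [e - s for e, s in zip(estimate, scaled)]
    signal_energy = sum(s * s for s in scaled)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def stft_magnitudes(signal, frame_len=8, hop=4):
    """Per-frame DFT magnitudes (naive, rectangular window, illustration only)."""
    mags = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        for k in range(frame_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            mags.append(abs(coeff))
    return mags


def normalized_magnitude_loss(estimate, target, eps=1e-8):
    """L1 magnitude distance divided by the L1 norm of the target magnitude."""
    est_mag = stft_magnitudes(estimate)
    tgt_mag = stft_magnitudes(target)
    diff = sum(abs(e - t) for e, t in zip(est_mag, tgt_mag))
    return diff / (sum(tgt_mag) + eps)
```

Dividing by the target's magnitude norm keeps the magnitude term on a comparable scale for loud and quiet utterances, which is presumably the motivation for the scaling the statement describes. In training, a negated SI-SDR would typically be combined with the magnitude term, but the weighting is not given in the quoted text.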