Interspeech 2022
DOI: 10.21437/interspeech.2022-10251

Multi-View Attention Transfer for Efficient Speech Enhancement

Abstract: Recent deep learning models have achieved high performance in speech enhancement; however, it is still challenging to obtain a fast and low-complexity model without significant performance degradation. Previous knowledge distillation studies on speech enhancement could not solve this problem because their output distillation methods do not fit the speech enhancement task in some aspects. In this study, we propose multi-view attention transfer (MV-AT), a feature-based distillation, to obtain efficient speech en…


Cited by 5 publications (1 citation statement)
References: 26 publications
“…To alleviate this issue, [16] proposed aligning intermediate features, while [17] used attention maps to do so. The latter was applied in the context of SE in [18] using considerably large, non-causal student models intended for offline applications. In [19], the authors addressed the dimensionality mismatch problem for the causal SE models by using frame-level Similarity Preserving KD [20] (SPKD).…”
Section: Introduction
confidence: 99%
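The attention-map alignment mentioned in the citation above ([17], applied to SE in [18]) can be sketched as follows. This is a generic illustration of activation-based attention transfer with hypothetical feature tensors and shapes, not the exact MV-AT formulation from the paper; one appeal of the approach is that the maps discard the channel dimension, sidestepping the teacher/student dimensionality mismatch.

```python
import numpy as np

def attention_map(features: np.ndarray) -> np.ndarray:
    """Collapse a (channels, time, freq) feature tensor into a spatial
    attention map: sum squared activations over channels, then L2-normalize."""
    amap = np.sum(features ** 2, axis=0)        # shape: (time, freq)
    return amap / (np.linalg.norm(amap) + 1e-12)

def attention_transfer_loss(teacher_feats: np.ndarray,
                            student_feats: np.ndarray) -> float:
    """MSE between normalized teacher and student attention maps.
    Channel counts may differ, since the maps are channel-free."""
    t_map = attention_map(teacher_feats)
    s_map = attention_map(student_feats)
    return float(np.mean((t_map - s_map) ** 2))

# Hypothetical intermediate features: a wide teacher, a slim student,
# sharing time/frequency resolution but not channel width.
rng = np.random.default_rng(0)
teacher = rng.standard_normal((64, 100, 257))
student = rng.standard_normal((16, 100, 257))
loss = attention_transfer_loss(teacher, student)
```

In training, a loss like this would be summed over several teacher/student layer pairs and added to the ordinary enhancement objective of the student.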