Interspeech 2019
DOI: 10.21437/interspeech.2019-2108
Speaker-Aware Deep Denoising Autoencoder with Embedded Speaker Identity for Speech Enhancement

Abstract: Previous studies indicate that noise and speaker variations can degrade the performance of deep-learning-based speech-enhancement systems. To increase the system performance over environmental variations, we propose a novel speaker-aware system that integrates a deep denoising autoencoder (DDAE) with an embedded speaker identity. The overall system first extracts embedded speaker identity features using a neural network model; then the DDAE takes the augmented features as input to generate enhanced spectra. Wit…
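The abstract describes a two-stage pipeline: a neural network extracts embedded speaker identity features, and a DDAE consumes the noisy spectral features augmented with that identity vector to produce enhanced spectra. The sketch below illustrates this structure in PyTorch; the 257-dimensional spectral frames, 64-dimensional embedding, and all layer sizes are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of a speaker-aware DDAE, assuming frame-wise log-power
# spectral features; dimensions and layer sizes are placeholders.
import torch
import torch.nn as nn


class SpeakerEmbedder(nn.Module):
    """Maps a frame of acoustic features to a fixed-size speaker embedding."""

    def __init__(self, feat_dim=257, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x):
        return self.net(x)


class SpeakerAwareDDAE(nn.Module):
    """Denoising autoencoder conditioned on the speaker embedding."""

    def __init__(self, feat_dim=257, emb_dim=64, hidden=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim + emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.decoder = nn.Linear(hidden, feat_dim)

    def forward(self, noisy_feats, spk_emb):
        # Augment each noisy frame with the speaker identity vector.
        augmented = torch.cat([noisy_feats, spk_emb], dim=-1)
        return self.decoder(self.encoder(augmented))


# Example: enhance a batch of 32 noisy spectral frames.
noisy = torch.randn(32, 257)
spk_emb = SpeakerEmbedder()(noisy)             # embedded speaker identity features
enhanced = SpeakerAwareDDAE()(noisy, spk_emb)  # enhanced spectra, same shape as input
```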

Cited by 25 publications (11 citation statements) | References 31 publications
“…In [77], a speaker-aware SE system was proposed and shown to provide improved performance. In this work, we conducted an additional set of experiments that integrate the speaker-aware techniques into the DAEME (termed DAEME-UAT (SA)).…”
Section: Experiments on the TMHINT Dataset
confidence: 99%
“…In this work, we conducted an additional set of experiments that integrate the speaker-aware techniques into the DAEME (termed DAEME-UAT (SA)). For DAEME-UAT (SA), we included the speaker information in the decoder using the same approach proposed in [77]. More specifically, we first extracted embedded speaker identity features using a pre-trained DNN model, which was trained to classify a frame-wise speech feature into a certain speaker identity.…”
Section: Experiments on the TMHINT Dataset
confidence: 99%
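The quoted passage describes deriving the speaker identity features from a DNN pre-trained to classify frame-wise speech features into speaker identities. A minimal sketch of that idea is given below, assuming the embedding is taken from the classifier's last hidden layer; the speaker count, layer sizes, and training details are hypothetical placeholders.

```python
# Sketch: frame-wise speaker classifier whose hidden activations are reused
# as embedded speaker identity features. All dimensions are assumptions.
import torch
import torch.nn as nn


class FrameSpeakerClassifier(nn.Module):
    def __init__(self, feat_dim=257, hidden=256, num_speakers=100):
        super().__init__()
        self.hidden_layers = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.classifier = nn.Linear(hidden, num_speakers)

    def forward(self, frames):
        h = self.hidden_layers(frames)   # frame-wise hidden representation
        return self.classifier(h), h     # (speaker logits, speaker embedding)


model = FrameSpeakerClassifier()
frames = torch.randn(16, 257)            # 16 frame-wise speech features
logits, spk_embedding = model(frames)
# At enhancement time, only `spk_embedding` is appended to the noisy features
# that feed the DDAE (or, in the DAEME-UAT(SA) variant, its decoder).
```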
“…Bhat et al. [1] enhanced dysarthric speech features to match those of healthy control speech using a time-delay neural network-based DDA. Chuang et al. [14] integrated a DDA with an embedded speaker identity for speech enhancement.…”
Section: Introduction
confidence: 99%
“…However, the corresponding noise segment from the same environment needs to be prepared to create the noise embedding through an embedding subnetwork, which makes speech enhancement at inference time inconvenient. Research on speaker-aware [13,14] and signal-to-noise-ratio (SNR)-aware [15] algorithms has also been proposed to improve the denoising performance of speech enhancement models.…”
Section: Introduction
confidence: 99%