2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016
DOI: 10.1109/icassp.2016.7472647

Audio enhancing with DNN autoencoder for speaker recognition

Abstract: In this paper we present a design of a DNN-based autoencoder for speech enhancement and its use for speaker recognition systems for distant microphones and noisy data. We started with augmenting the Fisher database with artificially noised and reverberated data and trained the autoencoder to map noisy and reverberated speech to its clean version. We use the autoencoder as a preprocessing step in the later stage of modelling in state-of-the-art text-dependent and text-independent speaker recognition systems. We…
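The augmentation step mentioned in the abstract (creating noisy, reverberated copies of clean speech so an autoencoder can learn the mapping back to the clean version) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `add_noise_at_snr` and `reverberate`, the sine-wave stand-in for speech, the synthetic impulse response, and the 10 dB SNR are all assumptions chosen for illustration.

```python
import numpy as np


def add_noise_at_snr(clean, noise, snr_db):
    """Mix `noise` into `clean` so the result has the requested SNR in dB."""
    # Tile or truncate the noise so it matches the length of the clean signal.
    noise = np.resize(noise, clean.shape)
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that clean_power / scaled_noise_power = 10^(snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


def reverberate(signal, rir):
    """Convolve with a room impulse response, keeping the original length."""
    return np.convolve(signal, rir)[: len(signal)]


# Hypothetical stand-ins: a 1 s, 440 Hz tone instead of real speech,
# white noise, and an exponentially decaying random impulse response.
rng = np.random.default_rng(0)
sample_rate = 16000
clean = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
noise = rng.standard_normal(sample_rate)
rir = rng.standard_normal(800) * np.exp(-np.arange(800) / 100.0)
rir[0] = 1.0  # direct path

# One training pair for an enhancement model: degraded input, clean target.
degraded = add_noise_at_snr(reverberate(clean, rir), noise, snr_db=10.0)
training_pair = (degraded, clean)
```

In a setup like the one the abstract describes, many such pairs would be generated from real recordings with varied noise types and room responses, and the autoencoder would be trained to predict the clean target from the degraded input.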

Cited by 56 publications
(29 citation statements)
References 17 publications
“…To compensate for these adverse impacts, various approaches have been proposed at different stages of the ASV systems. At the signal level, DNN based speech or feature enhancement [4,5,6,7] has been investigated for ASV under complex environment. At the feature level, feature normalization techniques [8] and noise-robust features such as power-normalized cepstral coefficients (PNCC) [9] have also been applied to ASV systems.…”
Section: Introduction (mentioning)
confidence: 99%
“…Future work will involve the automation of such manual processes, perhaps through the use of speaker diarization and/or audio enhancement (e.g. see [31]), as well as the expansion of the types of issues addressed in the system such as short-duration modeling [32] and signal quantization issues.…”
Section: Error Analysis (mentioning)
confidence: 99%
“…A common way to improve the robustness of the system is to train the system using a dataset consisting of clean and noisy data [14]. Speech enhancement is another way of denoising such as short-time spectral amplitude minimum mean square error (STSA-MMSE) [15] and many DNN-based enhancement methods [16,17,18]. Unlike previous works denoising in the front-end, we plan to use a multi-task adversarial framework to extract the noise-robust speaker representation directly.…”
Section: Introduction (mentioning)
confidence: 99%