2015
DOI: 10.1186/s13636-015-0056-7

Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification

Abstract: Deep neural network (DNN)-based approaches have been shown to be effective in many automatic speech recognition systems. However, few works have focused on DNNs for distant-talking speaker recognition. In this study, a bottleneck feature derived from a DNN and a cepstral domain denoising autoencoder (DAE)-based dereverberation are presented for distant-talking speaker identification, and a combination of these two approaches is proposed. For the DNN-based bottleneck feature, we noted that DNNs can transform th…

Cited by 52 publications (20 citation statements)
References 49 publications
“…Accordingly, accuracy levels of proposed deep neural networks (DNN) for speaker recognition (both verification and identification) are far surpassing previous state-of-the-art techniques. Recent examples include the use of embeddings obtained from convolutional neural networks (CNN) for speaker recognition in [1,2,3], the use of auto-encoder models for speaker identification in [4,5], and a number of cases utilizing ResNet for both speaker recognition and identification in [6,7].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the future, we try to apply dereverberation methods [22,28,29,31] for distant-talking accent recognition and evaluate our proposed method on real-word distant-talking speech data.…”
Section: Discussion (mentioning)
confidence: 98%
“…Unlike in the speech recognition tasks where the DNNs are used to get enhanced features from noisy features, researchers more prefer to use a DNN or convolutional neural network (CNN) to generate noise robustness bottleneck feature directly in speaker verification tasks [185][186][187]. As shown in Figure 11, acoustic features or feature maps are used to train a DNN/CNN with a bottleneck layer which has less nodes and closes to the output layer.…”
Section: Speech Recognition and Verification For The Internet Of (mentioning)
confidence: 99%