2016
DOI: 10.1007/978-3-319-40663-3_10
|View full text |Cite
|
Sign up to set email alerts
|

Usage of DNN in Speaker Recognition: Advantages and Problems

Abstract: Abstract. In this paper we consider different approaches of artificial neural networks application for speaker recognition task. We investigated the performance of DNN application at different levels of speaker recognition system: i-vector extraction level and model Back-End level. Results of our study perform high efficiency of the proposed neural network based approaches for solving this problem. It is shown that the use of DNN technology at different levels increases the reliability of speaker recognition s… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
4
1

Citation Types

0
7
0

Year Published

2016
2016
2023
2023

Publication Types

Select...
4
2
2

Relationship

4
4

Authors

Journals

citations
Cited by 11 publications
(8 citation statements)
references
References 10 publications
0
7
0
Order By: Relevance
“…Most of the text-independent speaker recognition systems are based on the i-vector extraction framework. Typically, i-vector computation process can be decomposed into three stages: collection of sufficient statistics, extraction of i-vectors and a probabilistic linear discriminant analysis (PLDA) backend [2,1,4]. Sufficient statistics are collected by using a sequence of feature vectors, e.g.…”
Section: Baseline I-vectorsmentioning
confidence: 99%
See 1 more Smart Citation
“…Most of the text-independent speaker recognition systems are based on the i-vector extraction framework. Typically, i-vector computation process can be decomposed into three stages: collection of sufficient statistics, extraction of i-vectors and a probabilistic linear discriminant analysis (PLDA) backend [2,1,4]. Sufficient statistics are collected by using a sequence of feature vectors, e.g.…”
Section: Baseline I-vectorsmentioning
confidence: 99%
“…The i-vector framework has inspired deep learning system design in this field. Particularly, in studies [2,4] they use an ASR deep neural network (ASR DNN) to divide acoustic space into senone classes, and the classic total variability (TV) model is applied to discriminate between speakers in that space [1]. In such phonetic discriminative DNN-based systems two major techniques can be distinguished.…”
Section: Introductionmentioning
confidence: 99%
“…Nonetheless, this problem is gradually gaining attention from the deep learning perspective. Particularly, studies [2,4] make use of the ASR deep neural network (ASR DNN) in order to divide acoustic space into senone classes, and the classic total variability (TV) model is applied to discriminate between speakers in that space afterwards [1].…”
Section: Introductionmentioning
confidence: 99%
“…Nonetheless, this problem is gradually gaining attention from the deep learning perspective. Particularly, studies [2,4] make use of the ASR deep neural network (ASR DNN) in order to divide acoustic space into senone classes, and the classic total variability (TV) model is applied to discriminate between speakers in that space afterwards [1].…”
Section: Introductionmentioning
confidence: 99%