2019
DOI: 10.18178/ijmlc.2019.9.1.760

Speaker Verification Using Deep Neural Networks: A Review

Abstract: Speaker verification involves examining the speech signal to authenticate a speaker's identity claim as true or false. Deep neural networks are one of the successful implementations of complex non-linear models that learn unique and invariant features of data. They have been employed in speech recognition tasks and have shown potential for speaker recognition as well. In this study, we investigate and review Deep Neural Network (DNN) techniques used in speaker verification systems. DNN are used from ext…

Cited by 23 publications (14 citation statements)
References: 34 publications
“…Adopting the modified MFCC feature vector has improved the recognition accuracy by 2.43% approximately. A huge research effort has been devoted to developing speaker modeling, including Gaussian mixture model (GMM) [18], GMM supervector with support vector machine (SVM) [19], and i-vector system [20, 21], deep learning systems [6, 22, 23], normalization [24, 25] and channel compensation techniques [26], and adaptation techniques [27, 28] to reduce the effect of these variations on the performance of the speaker recognition system. Impressive progress has been achieved in addressing external-based source of variations, particularly channel variations and environmental and background distortion [9].…”
Section: Related Work
confidence: 99%
“…A speaker identification system for social human–robot interaction, which is the main motivation of this research, should be able to extract a voice signature from unconstrained utterances while maintains its performance at challenging scenarios including various low signal-to-noise ratio and short length of utterance as well as different types of noise. However, most of the state-of-the-art speaker recognition systems, including those based on deep learning algorithms, have achieved significant performance on verification tasks which are not suitable for social human–robot interaction [4, 5, 6]. In speaker diarisation, which is a key feature for social robots, also known as “who spoke when”, a speech signal is partitioned into homogenous segments according to speaker identity [7].…”
Section: Introduction
confidence: 99%
“…First, this paper focuses on the recently development of deep learning based speaker recognition techniques which achieved the state-of-the-art performance in many situations, while most previous overviews are based on traditional speaker recognition methods [1,4,17–21]. Although the papers [22,23] summarized the deep learning based speaker recognition methods in certain aspects, our paper summarized different subtasks and topics of speaker recognition from new perspectives. Specifically, [22] presents an overview to the potential threats of adversarial attacks to speaker verification as well as the spoofing countermeasures, which is not the focus of this overview.…”
Section: Introduction
confidence: 99%
“…Specifically, [22] presents an overview to the potential threats of adversarial attacks to speaker verification as well as the spoofing countermeasures, which is not the focus of this overview. We provide a broad and comprehensive overview to a wide aspects of speaker verification, speaker diarization and domain adaptation etc, most of which have not be mentioned in [23].…”
Section: Introductionmentioning
confidence: 99%
“…Recently, some techniques for speaker verification have used deep neural networks (DNNs). In [15], a revision containing nine different techniques by employing DNN was presented. The results hit 0.2% and 0.88% for comparisons based on a dependent and independent text, respectively.…”
Section: Introduction
confidence: 99%