2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016
DOI: 10.1109/icassp.2016.7472652

End-to-end text-dependent speaker verification

Abstract: In this paper we present a data-driven, integrated approach to speaker verification, which maps a test utterance and a few reference utterances directly to a single score for verification and jointly optimizes the system's components using the same evaluation protocol and metric as at test time. Such an approach will result in simple and efficient systems, requiring little domain-specific knowledge and making few model assumptions. We implement the idea by formulating the problem as a single neural network arch…

citations
Cited by 476 publications
(416 citation statements)
references
References 20 publications
“…The proposed method has a close connection to the softmax classifier on the class-center learning method. The objective function (5) aims to maximize the pAUC of the pairwise training set $T_{t_1}$ at a mini-batch iteration, while the cross-entropy minimization with softmax aims to classify the $t_1$ utterances that are used to construct the $T_{t_1}$. The class centers $\{w_u\}_{u=1}^{U}$ are used for constructing $T_{t_1}$ in the pAUC optimization, and used as the parameters of the softmax classifier in (7).…”
Section: Connection To Cross-entropy Minimization With Softmax (mentioning)
confidence: 99%
“…ASV is undisputedly a crucial technology for biometric identification, which is broadly applied in real-world applications like banking and home automation. Considerable performance improvements in terms of both accuracy and efficiency of ASV systems have been achieved through active research in a diversity of approaches [1][2][3][4][5][6]. [4] proposed a method that use the Gaussian mixture model to extract acoustic features and then apply the likelihood ratio for scoring.…”
Section: Introduction (mentioning)
confidence: 99%
“…by [5] to improve verification accuracy and make the ASV model compact and efficient.…”
Section: Introduction (mentioning)
confidence: 99%
“…Given a test recording, the embedding for this recording is compared against the embeddings generated from the enrolment utterances using a suitable distance metric. Speaker verification algorithms can be characterised based on whether the phonetic content in the inputs is limited, which is known as text-dependent speaker verification [9]. Alternatively, text-independent systems operate with no restrictions on the phonetic content [3].…”
Section: Introduction (mentioning)
confidence: 99%
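
The scoring step described in the statement above (a test embedding compared against enrolment embeddings with a distance metric) can be illustrated with a minimal sketch. It is not taken from the cited paper or from any of the citing papers. It assumes that embeddings are already available as plain lists of floats, that the enrolment embeddings are averaged into a single speaker model, and that cosine similarity serves as the metric. The function names and the `threshold` value of 0.5 are hypothetical.

```python
import math


def mean_embedding(embeddings):
    """Average equal-length embedding vectors into one speaker model (assumed pooling)."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; raises if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def verify(test_embedding, enrolment_embeddings, threshold=0.5):
    """Return (score, accepted) for one verification trial.

    The threshold is a hypothetical operating point, not a recommended value.
    """
    speaker_model = mean_embedding(enrolment_embeddings)
    score = cosine_similarity(test_embedding, speaker_model)
    return score, score >= threshold


# Toy 3-dimensional embeddings, made up for illustration only.
enrolment = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
target_score, target_accepted = verify([0.9, 0.1, 0.0], enrolment)
impostor_score, impostor_accepted = verify([0.0, 0.0, 1.0], enrolment)
```

In this toy setup the target trial points in the same direction as the averaged enrolment vector and is accepted, while the orthogonal impostor trial is rejected. In practice the threshold would be tuned on held-out trials rather than fixed by hand.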