2016 IEEE Spoken Language Technology Workshop (SLT) 2016
DOI: 10.1109/slt.2016.7846262

Further optimisations of constant Q cepstral processing for integrated utterance and text-dependent speaker verification

Abstract: Many authentication applications involving automatic speaker verification (ASV) demand robust performance using short-duration, fixed or prompted text utterances. Text constraints not only reduce the phone-mismatch between enrolment and test utterances, which generally leads to improved performance, but also provide an ancillary level of security. This can take the form of explicit utterance verification (UV). An integrated UV + ASV system should then verify access attempts which contain not just the expected …

Cited by 32 publications (26 citation statements)
References 22 publications
“…A standard Gaussian mixture model with universal background model (GMM-UBM) ASV system was used for ranking [13]. It uses a Mel-frequency cepstral coefficient (MFCC) front-end and a 512-component UBM trained using RSR2015 [14] and TIMIT 7 databases.…”
Section: Replay Configurations (mentioning)
confidence: 99%
“…We chose a lightweight GMM-UBM-based system to conduct rapid parameter experimentation with computationally heavy 2DAR models. As demonstrated in [21], GMM-UBM provides a competitive accuracy on the RedDots data consisting of short utterances. Figure 4 presents the results for the speaker verification experiments on RedDots corpus in terms of EER.…”
Section: Features and Classifier (mentioning)
confidence: 92%
“…They include (i) a standard Mel-frequency cepstral coefficients (MFCC) [14] frontend and (ii) an infinite impulse response - constant Q, Mel-frequency cepstral coefficients (ICMC) [15] frontend. The latter has been applied successfully to tasks including speaker recognition, utterance verification [15] and speaker diarization [9,16]. These features are similar to MFCC, but they replace the short-time Fourier transform by an infinite-impulse response, constant Q transform (IIR-CQT) [17].…”
Section: Feature Extraction (mentioning)
confidence: 99%