Explicit modelling of session variability for speaker verification

Vogt, Robbie; Sridharan, Sridha

doi:10.1016/j.csl.2007.05.003

Cited by 109 publications

(118 citation statements)

References 13 publications

(9 reference statements)

Citation types: Supporting, Mentioning (116), Contrasting, Unclassified

“…The subspace U is estimated using an expectation-maximization (EM) algorithm. For details of the ISV technique for face-verification, please refer to the works of Wallace et al. [3] and Vogt et al. [30]. When enrolling a new client, i, using a set of enrollment images (indexed by j), the latent variables $x_{i,j}$ and $z_i$ are estimated from the enrollment images, and finally, the client-specific supervector, $c_i$, is computed as:…”

Section: GMM-based FR Using Inter-session Variability Modeling (mentioning)

confidence: 99%

Deeply vulnerable: a study of the robustness of face recognition to presentation attacks

2017


The vulnerability of deep-learning based face-recognition (FR) methods to presentation attacks (PA) is studied in this paper. Recently proposed FR methods based on deep neural networks (DNN) have been shown to outperform most other methods by a significant margin. In a trustworthy face-verification system, however, maximizing recognition performance alone is not sufficient: the system should also be capable of resisting various kinds of attacks, including presentation attacks (PA). Previous experience has shown that the PA-vulnerability of FR systems tends to increase with face-verification accuracy. Using several publicly available PA datasets, we show that DNN-based FR systems compensate for variability between bona fide and PA samples, and tend to score them similarly, which makes such FR systems extremely vulnerable to PAs. Experiments show the vulnerability of the studied DNN-based FR systems to be consistently higher than 90%, and often higher than 98%.


“…Many techniques have been proposed with the most notable systems based on Gaussian mixture model (GMM), inter-session variability (ISV) modeling [10], joint factor analysis (JFA) [16], and i-vectors [11].…”

Section: Vulnerability Of Voice Biometrics (mentioning)

confidence: 99%

“…To demonstrate vulnerability of ASV systems to presentation attacks, we consider two systems based on inter-session variability (ISV) modeling [10] and i-vectors [11], which are the state of the art speaker verification systems able to effectively deal with intra-class and inter-class variability. In these systems, voice activity detection is based on the modulation of the energy around 4 Hz, the features include 20 mel-scale frequency coefficients (MFCC) and energy, with their first and second derivatives, and modeling was performed with 256 Gaussian components using 25 expectation-maximization (EM) iterations.…”

Section: Vulnerability Of Voice Biometrics (mentioning)

confidence: 99%
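
The front end quoted above appends first and second derivatives to the static features. As a rough illustration only, the sketch below uses the common regression-based delta formula with edge padding; the quoted text does not say which derivative method was used, so the formula, the `width=2` window, and the helper names `deltas` and `stack_with_derivatives` are assumptions. If the static vector is read as 20 cepstral coefficients plus one energy term, stacking gives 63 values per frame.

```python
def deltas(frames, width=2):
    """Regression-based delta features (assumed method, not from the source).

    Frames beyond the edges are replaced by the first or last frame.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]  # clamp at the last frame
                prv = frames[max(t - k, 0)][d]      # clamp at the first frame
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_derivatives(static):
    """Concatenate static features with their first and second derivatives."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy input: 5 frames of 21 static values (20 coefficients + energy), made up
# for illustration; real values would come from an MFCC front end.
static = [[float(t * (d + 1)) for d in range(21)] for t in range(5)]
features = stack_with_derivatives(static)
print(len(features), len(features[0]))
```

The resulting vectors would then be the input to GMM training (256 components and 25 EM iterations in the quoted setup), which is not shown here.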

“…The score fusion-based systems integration allows to separate bona fide data of the valid users, who are trying to be verified by the system, from both presentation attacks and genuine data of the non-valid users or so-called zero-impostors. For ASV system, we adopt verification approaches based on inter-session variability (ISV) modeling [10] and i-vectors [11], as the state of the art systems for speaker verification.…”

Section: Introduction (mentioning)

confidence: 99%


Presentation attack detection in voice biometrics

Korshunov, Marcel

2017

User-Centric Privacy and Security in Biometrics


Recent years have shown an increase in both the accuracy of biometric systems and their practical use. The application of biometrics is becoming widespread with fingerprint sensors in smartphones, automatic face recognition in social networks and video-based applications, and speaker recognition in phone banking and other phone-based services. The popularization of the biometric systems, however, exposed their major flaw: high vulnerability to spoofing attacks [1]. A fingerprint sensor can be easily tricked with a simple glue-made mold, a face recognition system can be accessed using a printed photo, and a speaker recognition system can be spoofed with a replay of pre-recorded voice. The ease with which a biometric system can be spoofed demonstrates the importance of developing efficient anti-spoofing systems that can detect both known (conceivable now) and unknown (possible in the future) spoofing attacks. Therefore, it is important to develop mechanisms that can detect such attacks, and it is equally important for these mechanisms to be seamlessly integrated into existing biometric systems for practical and attack-resistant solutions. To be practical, however, an attack detection system should (i) have high accuracy, (ii) generalize well to different attacks, and (iii) be simple and efficient. One reason for the increasing demand for effective presentation attack detection (PAD) systems is the ease of access to people's biometric data. So often, a potential attacker can almost effortlessly obtain necessary biometric samples from social networks, including facial images, audio and video recordings, and even extract fingerprints from high resolution images.
Therefore, various privacy protection solutions, such as legal privacy requirements and algorithms for obfuscating personal information, e.g., visual privacy filters [2], as well as social awareness of threats to privacy can also increase security of personal information and potentially reduce the vulnerability of biometric systems. In this chapter, however, we focus on presentation attack detection in voice biometrics, i.e., automatic speaker verification (ASV) systems. We discuss vulnerabilities of these systems to presentation attacks (PAs), present different state of the art…


“…The most commonly used acoustic vectors are Mel Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC) and Perceptual Linear Prediction Cepstral (PLPC) Coefficients and zero crossing coefficients (Yegnanarayana et al., 2005; Vogt et al., 2005). All these features are based on the spectral information derived from a short time windowed segment of speech.…”

Section: Literature Review (mentioning)

confidence: 99%
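
The statement above notes that these features are all computed from short, windowed segments of speech. A minimal sketch of that framing step follows, assuming a 25 ms frame, a 10 ms hop, a Hamming window, and an 8 kHz sample rate; none of these values come from the quoted text, and `frame_signal` is a hypothetical helper.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a signal into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


sample_rate = 8000                      # assumed sample rate in Hz
frame_len = sample_rate * 25 // 1000    # 25 ms frame (assumption)
hop = sample_rate * 10 // 1000          # 10 ms hop (assumption)

# One second of a synthetic 440 Hz tone stands in for real speech.
signal = [math.sin(2 * math.pi * 440 * n / sample_rate)
          for n in range(sample_rate)]
frames = frame_signal(signal, frame_len, hop)
print(len(frames), len(frames[0]))
```

Each windowed frame would then be passed to a spectral analysis stage (for example a filter bank and cepstral transform for MFCC), which is outside the scope of this sketch.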

Feature Extraction Techniques

Computer-Based Design and Manufacturing


Problem statement: This study introduces a new method for speaker verification that fuses two different feature extraction methods to improve recognition accuracy and security. Approach: The proposed system uses Mel frequency cepstral coefficients for speaker identification and Modified MFCC for verification. For speaker modeling, vector quantization is used. Results: The effect of different segmental feature lengths, as well as of speaker modeling, on speaker recognition was investigated. The performance was evaluated on 1000 speakers across 10 different languages, with 10 sec. of speech for training and 5 sec. samples for testing. Conclusion/Recommendations: Experimental results showed that the proposed system achieves a higher recognition accuracy of 93% when the number of filter banks used for feature extraction is increased, which is competitive with existing vector quantization systems at lower computational complexity. The system efficiency may be further improved using other speaker modeling techniques like GMM and HMM.
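
The abstract above says vector quantization is used for speaker modeling. One common way to score with such a model is to pick the speaker whose codebook gives the lowest average distortion on the test vectors; the sketch below illustrates that idea only. The codebooks, the squared Euclidean distance, and the function names are assumptions for illustration, not the cited system's method, and codebook training is omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def avg_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(vectors, codebooks[name]))


# Hypothetical two-dimensional codebooks; real ones would be trained on
# per-speaker feature vectors (for example with k-means).
codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 4.0]],
}
test_vectors = [[0.9, 1.1], [0.1, -0.2], [1.2, 0.8]]
best = identify(test_vectors, codebooks)
print(best)
```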

