A Deeper Look at Gaussian Mixture Model Based Anti-Spoofing Systems

Chettri, Bhusan; Sturm, Bob L.

doi:10.1109/icassp.2018.8461467

Cited by 18 publications

(12 citation statements)

References 11 publications


“…On this database, however, we find it hard to analyse these factors in isolation for two reasons: (1) Unavailability of meta-data for genuine recordings; (2) Segregating the three factors AE, PD and RD from a replayed signal is difficult. Our further analysis on frame-level energy and log-likelihood distributions shows existence of the cues in the genuine signals, similar to the findings of [12] on version 1.0 of the corpus.…”

Section: Introduction (supporting)

confidence: 87%

“…The ASVspoof 2017 version 1.0 corpus [13] has been released as a part of the second automatic speaker verification spoofing and countermeasures challenge [14] designed to foster research in "replay spoofing" countermeasures. Post-evaluation, [12] demonstrated how class predictions could be manipulated using the cues present in some of the genuine audio recordings of the corpus. Subsequently, version 2.0 [11] has been released online addressing these data anomalies.…”

Section: The ASVspoof 2017 Corpus (mentioning)

confidence: 99%


Analysing Replay Spoofing Countermeasure Performance Under Varied Conditions

Chettri, Sturm, Benetos 2018

2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP)

Self Cite


In this paper, we aim to understand what makes replay spoofing detection difficult in the context of the ASVspoof 2017 corpus. We use FFT spectra, mel frequency cepstral coefficients (MFCC) and inverted MFCC (IMFCC) frontends and investigate different backends based on Convolutional Neural Networks (CNNs), Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs). On this database, we find that IMFCC frontend based systems show smaller equal error rate (EER) for high quality replay attacks but higher EER for low quality replay attacks in comparison to the baseline. However, we find that it is not straightforward to understand the influence of an acoustic environment (AE), a playback device (PD) and a recording device (RD) of a replay spoofing attack. One reason is the unavailability of metadata for genuine recordings. Second, it is difficult to account for the effects of the factors AE, PD and RD, and their interactions. Finally, our frame-level analysis shows that the presence of cues (recording artefacts) in the first few frames of genuine signals (missing from replayed ones) influences class prediction.
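The equal error rate (EER) quoted in this abstract is the operating point where the false acceptance rate and false rejection rate coincide. The following is a minimal sketch of one way to approximate it from two lists of detection scores. It is not the evaluation code used by the authors or the challenge; the function name `equal_error_rate`, the score convention (higher means more likely genuine) and the example scores are assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumed convention: a higher score means "more likely genuine".
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # Keep the threshold where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Hypothetical scores: one genuine trial (0.4) falls below one replayed trial (0.6).
genuine = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine, spoof)  # 0.25 for these scores
```

With only a handful of trials the sweep can land between two thresholds, which is why the sketch averages the two error rates at the closest point instead of requiring exact equality.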


“…Post-evaluation, the organisers became aware of a number of data anomalies that have potential to influence results and findings [11]. These mostly involve periods of silence, or zero-valued samples that are present in the original RedDots data [7].…”

Section: Database Update (mentioning)

confidence: 99%

ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements

Delgado, Todisco, Sahidullah et al. 2018

The Speaker and Language Recognition Workshop (Odyssey 2018)


The now-acknowledged vulnerabilities of automatic speaker verification (ASV) technology to spoofing attacks have spawned interest in developing so-called spoofing countermeasures. By providing common databases, protocols and metrics for their assessment, the ASVspoof initiative was born to spearhead research in this area. The first competitive ASVspoof challenge held in 2015 focused on the assessment of countermeasures to protect ASV technology from voice conversion and speech synthesis spoofing attacks. The second challenge switched focus to the consideration of replay spoofing attacks and countermeasures. This paper describes Version 2.0 of the ASVspoof 2017 database which was released to correct data anomalies detected post-evaluation. The paper contains as-yet unpublished meta-data which describes recording and playback devices and acoustic environments. These support the analysis of replay detection performance and limits. Also described are new results for the official ASVspoof baseline system which is based upon a constant Q cepstral coefficient frontend and a Gaussian mixture model backend. Reported are enhancements to the baseline system in the form of log-energy coefficients and cepstral mean and variance normalisation in addition to an alternative i-vector backend. The best results correspond to a 48% relative reduction in equal error rate when compared to the original baseline system.
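Cepstral mean and variance normalisation, named above as one of the baseline enhancements, rescales each cepstral coefficient to zero mean and unit variance. The sketch below applies it per utterance over all frames. Whether the enhanced baseline normalises per utterance, and over which frames, is not stated in this abstract, so that choice, the function name `cmvn`, the `eps` guard and the toy feature values are all assumptions.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Normalise each coefficient to zero mean and unit variance across frames.

    frames: list of per-frame feature vectors, all of the same length.
    eps:    small constant (an assumption here) that avoids division by zero
            when a coefficient is constant over the utterance.
    """
    dims = list(zip(*frames))          # transpose: one sequence per coefficient
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) for d in dims]   # population standard deviation
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


# Toy utterance: three frames, two coefficients (the second one is constant).
features = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
normalised = cmvn(features)
```

After normalisation the first coefficient has mean 0 and variance close to 1, and the constant second coefficient maps to 0 in every frame.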


“…Our results suggest that performance metrics reported on the current PA dataset may be overestimating the actual performance of the models, which might become somewhat of a "horse" [17] that trivially sidesteps the actual problem, thus raising concerns about model validity as well as performance results. Prior work has addressed a similar issue of silence on the ASVspoof 2017 PA dataset [18], which calls for careful design and validation of the 2019 PA spoofing dataset.…”

Section: Introduction (mentioning)

confidence: 99%

Ensemble Models for Spoofing Detection in Automatic Speaker Verification

et al. 2019

Self Cite


Spectrograms (time-frequency representations of audio signals) have found widespread use in neural network-based spoofing detection. While deep models are trained on the fullband spectrum of the signal, we argue that not all frequency bands are useful for these tasks. In this paper, we systematically investigate the impact of different subbands and their importance on replay spoofing detection on two benchmark datasets: ASVspoof 2017 v2.0 and ASVspoof 2019 PA. We propose a joint subband modelling framework that employs n different sub-networks to learn subband specific features. These are later combined and passed to a classifier and the whole network weights are updated during training. Our findings on the ASVspoof 2017 dataset suggest that the most discriminative information appears to be in the first and the last 1 kHz frequency bands, and the joint model trained on these two subbands shows the best performance, outperforming the baselines by a large margin. However, these findings do not generalise to the ASVspoof 2019 PA dataset. This suggests that the datasets available for training these models do not reflect real world replay conditions, and that datasets for training replay spoofing countermeasures need careful design.
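To make the "first and last 1 kHz" finding concrete, the sketch below shows how those two subbands could be cut out of a single spectrogram frame before being passed to separate sub-networks. It covers only the bin selection, not the joint model itself. The 16 kHz sampling rate, the 512-point FFT and the helper names `band_bins` and `select_subbands` are assumptions for illustration and are not taken from the paper.

```python
def band_bins(low_hz, high_hz, n_fft=512, sample_rate=16000):
    """Indices of one-sided FFT bins whose centre frequency lies in [low_hz, high_hz]."""
    hz_per_bin = sample_rate / n_fft
    return [k for k in range(n_fft // 2 + 1)
            if low_hz <= k * hz_per_bin <= high_hz]


def select_subbands(spectrum_frame, bands, n_fft=512, sample_rate=16000):
    """Keep only the bins of one spectrum frame that fall inside the given bands."""
    keep = [k for low, high in bands
            for k in band_bins(low, high, n_fft, sample_rate)]
    return [spectrum_frame[k] for k in keep]


# Placeholder frame with 257 one-sided bins; each value equals its bin index
# so the selection is easy to inspect.
frame = list(range(257))
# Assumed bands: 0-1 kHz and 7-8 kHz, i.e. the first and last 1 kHz at 16 kHz.
selected = select_subbands(frame, [(0, 1000), (7000, 8000)])
```

Each band spans 33 bins at this resolution (31.25 Hz per bin), so `selected` holds 66 values: bins 0 to 32 followed by bins 224 to 256.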
