Novel Variable Length Energy Separation Algorithm Using Instantaneous Amplitude Features for Replay Detection

In recent years, automatic speaker verification (ASV) has been used extensively for voice biometrics. This has led to increased interest in securing these voice biometric systems for real-world applications. ASV systems are vulnerable to various kinds of spoofing attacks, namely synthetic speech (SS), voice conversion (VC), replay, twins, and impersonation. This paper provides a literature review of ASV spoof detection, novel acoustic feature representations, deep learning, end-to-end systems, etc. Furthermore, the paper summarizes previous studies of spoofing attacks, with emphasis on SS, VC, and replay, along with recent efforts to develop countermeasures for the spoof speech detection (SSD) task. The limitations and challenges of the SSD task are also presented. While several countermeasures have been reported in the literature, they are mostly validated on a particular database; furthermore, their performance is far from perfect. The security of voice biometric systems against spoofing attacks remains a challenging topic. This paper is based on a tutorial presented at the APSIPA Annual Summit and Conference 2017 and serves as a quick start for those interested in the topic.

“…The short-time AM-FM features set obtained using Energy Separation Algorithm (ESA) were studied in [120,121] as shown in Fig. 11.…”

Section: Acoustic Features (mentioning)

confidence: 99%

Advances in anti-spoofing: from the perspective of ASVspoof challenges

Kamble, Sailor, Patil et al. 2020

SIP
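
For readers unfamiliar with the Energy Separation Algorithm (ESA) named in the citation statement above, the following is a minimal sketch of the fixed-length discrete form usually called DESA-2, built on the Teager-Kaiser energy operator. It is not the variable-length variant (VESA) proposed in the cited paper, whose details are not given here. The function names `teager_energy` and `desa2`, and the choice to return zeros where the energy estimates are non-positive, are assumptions made for illustration.

```python
import math


def teager_energy(x):
    """Teager-Kaiser energy operator: psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    The output has len(x) - 2 values, covering samples 1 .. len(x) - 2.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x):
    """DESA-2 estimates of instantaneous amplitude and frequency (rad/sample).

    Returns two lists of length len(x) - 4, covering samples 2 .. len(x) - 3.
    """
    # Energy of the signal, trimmed so it lines up with the energy of y below.
    psi_x = teager_energy(x)[1:-1]
    # Symmetric difference y(n) = x(n+1) - x(n-1).
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_y = teager_energy(y)

    amp, freq = [], []
    for px, py in zip(psi_x, psi_y):
        if px <= 0.0 or py <= 0.0:
            # Assumption for this sketch: report zeros where the estimate is undefined.
            amp.append(0.0)
            freq.append(0.0)
            continue
        # Clamp to the valid domain of acos to absorb rounding error.
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freq.append(0.5 * math.acos(c))
        amp.append(2.0 * px / math.sqrt(py))
    return amp, freq


# Usage: a pure tone with amplitude 0.5 and frequency 0.2*pi rad/sample.
tone = [0.5 * math.cos(0.2 * math.pi * n + 0.3) for n in range(200)]
amplitude, frequency = desa2(tone)
```

For a pure tone A·cos(Ωn + φ) with 0 < Ω < π/2, the sketch recovers A and Ω up to rounding error. ESA is typically applied to narrowband (bandpass-filtered) signals, so a speech signal would normally be split into subbands before this step.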

“…• Discrete Fourier transform (DFT) based features: which include Mel frequency cepstral coefficients (MFCC) [4,13,36], mel filterbank slope [10], linear filterbank slope [10], and Q-log domain DFT-based mean normalized log spectral [42]. • Variable length energy separation algorithm (VESA)-based features: which include instantaneous frequency cosine coefficients based on VESA [6] and instantaneous amplitude cosine coefficients based on VESA [43].…”

Section: Related Work (mentioning)

confidence: 99%

Discriminative features based on modified log magnitude spectrum for playback speech detection

Yang, Ren et al. 2020

J AUDIO SPEECH MUSIC PROC.

In order to improve the performance of hand-crafted features to detect playback speech, two discriminative features, constant-Q variance-based octave coefficients and constant-Q mean-based octave coefficients, are proposed for playback speech detection in this work. They rely on our findings that variance-based modified log magnitude spectrum and mean-based modified log magnitude spectrum can enhance the discriminative power between genuine speech and playback speech. Then constant-Q variance-based octave coefficients (constant-Q mean-based octave coefficients) can be obtained by combining variance-based modified log magnitude spectrum (mean-based modified log magnitude spectrum), octave segmentation, and discrete cosine transform. Finally, constant-Q variance-based octave coefficients and constant-Q mean-based octave coefficients are evaluated on ASVspoof 2017 corpus version 2.0 and ASVspoof 2019 physical access, respectively. Experimental results show that variance-based modified log magnitude spectrum and mean-based modified log magnitude spectrum can produce discriminative features toward playback speech. Further results on the two databases show that constant-Q variance-based octave coefficients and constant-Q mean-based octave coefficients can perform better than some common features, such as mel frequency cepstral coefficients and constant-Q cepstral coefficients.

“…When speech is replayed through a playback device, or recorded on a recording device, its frequency attributes are changed [7]-[12]. Replay attack detection can be regarded as a task that distinguishes the difference in the frequency attributes between genuine and replayed speeches.…”

Section: Introduction (mentioning)

confidence: 99%

A New Replay Attack Against Automatic Speaker Verification Systems

et al. 2020

With the increasing popularity of automatic speaker verification (ASV), the reliability of ASV systems has also gained importance. ASV is vulnerable to various spoofing attacks, especially replay attacks. Thus, recent public competitions and studies based on spoofing attack detection for ASV have mainly focused on the detection of replay attacks. Generally, replayed speech includes the attributes of one playback and two recording devices: the playback device, the recording device used by the attacker, and the recording device embedded in any system to verify input utterances. Therefore, the main attributes differentiating a replayed speech from the genuine speech are the attributes of the playback and the recording devices used by the attacker. In this paper, we propose a novel replay attack and its defense through observation of the general speech-spoofing process. The proposed attack includes only the attribute of one recording device embedded in an ASV system; genuine speech passes through the recording device only once, and the replayed speech produced for the proposed attack passes through the same recording device twice. Because the proposed attack is feasible, it can be considered a new task for replay countermeasures in the training process in order to develop a robust ASV protection system. The experimental results show that this novel replay attack cannot be detected by several of the existing state-of-the-art replay attack detection systems. Furthermore, the new attack can be detected by the same systems successfully if they are retrained with an appropriate dataset designed for the new task.

Index terms: automatic speaker verification, replay attack, same recording device, spoofing detection.

Cited by 19 publications

References 33 publications
