Voice liveness detection based on pop-noise detector with phoneme information for speaker verification

Mochizuki, Shihono; Shiota, Sayaka; Kiya, Hitoshi

doi:10.1121/1.4969520

Cited by 11 publications

(3 citation statements)

References 0 publications

Section: Overall Performance (mentioning)

confidence: 53%

“…We confirm the effectiveness of our system against replay attacks and impersonation attacks by comparing it with the baseline (VLD) [18] and the conference version of VoicePop [1]. In [18], Sayaka Shiota et al. proposed the pop noise detector combined with the phoneme information to detect the existence of pop noises, but the replayed samples were easily recognized as legitimate samples under their proposed algorithm. However, VLD does not consider using the characteristics of the pop noise for further classification, nor does it consider the impersonation attack when the adversary replays the audio and mimics breathing at the […] This also shows that the combination of pop noise and its airflow pressure can improve the detection rate of the pop noise-only feature.…”

Section: Overall Performance (mentioning)

confidence: 53%

“…Sayaka Shiota et al. [17] proposed the pop noise detector, which combines the single- and the double-channel to detect pop noise. They further incorporated the phoneme information for pop noise detection in [18]. However, their studies rely on the specific microphone model and cannot perform well when applied to mobile devices.…”

Section: Related Work (mentioning)

confidence: 99%

Securing Liveness Detection for Voice Authentication via Pop Noises

Jiang, Wang, Lin, et al. 2023

IEEE Trans. Dependable and Secure Comput.

Voice authentication has been increasingly adopted for sensitive operations on mobile devices. While voice biometrics can distinguish individuals by their spectral features (such as voiceprints), they are known to be prone to spoofing attacks, where malicious attackers can use pre-recorded or synthesized samples from legitimate users or impersonate the speaking style of the targeted user to deceive the voice authentication system. In this paper, we design and implement a novel software-only anti-spoofing system on smartphones. Our system leverages the pop noise, which is generated by the user's oral airflow when speaking the passphrase close to the microphone. The pop noise is delicate and subject to user diversity, making it hard to be recorded by replay attacks beyond a certain distance or to be imitated precisely by impersonators. Specifically, we design a new pop noise detection scheme to pinpoint pop noises at the phonemic level, based on which we establish a theoretical model to calculate the sound pressure level from the speech signal in order to get the estimated pressure signal, and then analyze the consistency with the actual pressure signal extracted from the pop noise. Furthermore, we calculate the similarity score of the unique sequences which describe the individually unique relationship between pop noises and phonemes to resist spoofing attacks. Our evaluation on a dataset of 30 participants and three smartphones shows that our system achieves over 94.79% accuracy. Our system requires no additional hardware and is robust to various factors including authentication angle, authentication distance, the length of passphrase, ambient noise, etc.

Section: Overall Performance (mentioning)

confidence: 53%