Shihono Mochizuki scite author profile

Shihono Mochizuki

2 Publications

4 Citation Statements Received

20 Citation Statements Given

Publications

Voice liveness detection based on pop-noise detector with phoneme information for speaker verification

Mochizuki, Shiota, Kiya

2016

This paper proposes a pop-noise detector using phoneme information for a voice liveness detection (VLD) framework. In recent years, spoofing attacks (e.g., replay, speech synthesis, and voice conversion) have become a serious problem for speaker verification systems. Several techniques have been proposed to protect speaker verification systems from these attacks, and the VLD framework has been proposed as one of the fundamental solutions. The VLD framework determines whether an input sample was uttered by an actual human or played back through a loudspeaker. To realize the VLD framework, pop-noise detection methods have been proposed, and these methods perform well as the VLD module. However, because pop-noise is a common distortion in speech that occurs when a speaker’s breath reaches a microphone, it can also be produced by wind or deliberately by an attacker. This is one weakness of pop-noise detection methods. To improve their robustness, this paper proposes a pop-noise detector that uses phoneme information as evidence of an actual human speaker. The experimental results show that the proposed method increases the robustness of VLD against spoofing attacks.

Voice liveness detection using phoneme-based pop-noise detector for speaker verification

Mochizuki, Shiota, Kiya

2018

This paper proposes a phoneme-based pop-noise (PN) detection algorithm for voice liveness detection (VLD) and automatic speaker verification systems. Recently, many countermeasures against spoofing attacks (e.g., replay, speech synthesis) have been reported for speaker verification systems. A principal mechanism of almost all spoofing attacks is to replay recorded speech through a loudspeaker. Therefore, one effective solution against spoofing attacks is to determine whether an input speech signal is a genuine voice or a replayed one; this is the VLD framework. To realize the VLD framework, PN detection methods have been proposed. Since PN is a common distortion that occurs when a speaker's breath reaches the inside of a microphone, conventional PN detection methods simply capture PN periods during the input speech. However, the performance of these methods depends on the microphone type and the phrase spoken, which may leave conventional PN detection methods vulnerable. This paper proposes a novel PN detection method that focuses on specific characteristics of phonemes related to the PN phenomenon. The experimental results show that the proposed method outperforms conventional PN detection methods.
