Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security 2021
DOI: 10.1145/3460120.3485389
|View full text |Cite
|
Sign up to set email alerts
|

CapSpeaker: Injecting Voices to Microphones via Capacitors

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1

Citation Types

0
4
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
5
3

Relationship

0
8

Authors

Journals

citations
Cited by 16 publications
(4 citation statements)
references
References 39 publications
0
4
0
Order By: Relevance
“…Thus, inaudible attacks [5], [21], [22] have been proposed, which exploit carrier signals outside the audible frequencies of human beings (e.g., 40 kHz) to inject attacks into ASR systems utilizing the nonlinearity vulnerability of microphone circuits, yet entirely unheard by victims. However, compared with audible playback speech samples, such attacks usually suffer from signal distortion and low SNR due to their dependence on various convert channels, e.g., ultrasound [57], laser [6], or electricity [24] signals, and the hardware imperfections these channels introduce. There is also a major branch of the research community that leverages the vulnerability of ASR models by adding slightly audible perturbations on the benign audio based on ϵ-constraint [7], [58] and psychoacoustic hiding [3], [4], to make the AEs sound benign but fool the ASR's transcription.…”
Section: Custom Adversarial Examples and Inaudible Attacksmentioning
confidence: 99%
“…Thus, inaudible attacks [5], [21], [22] have been proposed, which exploit carrier signals outside the audible frequencies of human beings (e.g., 40 kHz) to inject attacks into ASR systems utilizing the nonlinearity vulnerability of microphone circuits, yet entirely unheard by victims. However, compared with audible playback speech samples, such attacks usually suffer from signal distortion and low SNR due to their dependence on various convert channels, e.g., ultrasound [57], laser [6], or electricity [24] signals, and the hardware imperfections these channels introduce. There is also a major branch of the research community that leverages the vulnerability of ASR models by adding slightly audible perturbations on the benign audio based on ϵ-constraint [7], [58] and psychoacoustic hiding [3], [4], to make the AEs sound benign but fool the ASR's transcription.…”
Section: Custom Adversarial Examples and Inaudible Attacksmentioning
confidence: 99%
“…Researchers have utilized different kinds of physical signals such as electromagnetic, ultrasonic, and light signals in sensor attacks on smart voice assistants [28], [16], [17], [29], [30], [25], [31], [32], [33]. These attacks explored the physicallevel risks of exploiting sensors by transmitting determined signals (e.g., recorded voice) modulated in specific, out-ofband carriers to maliciously trigger an event in the victim system.…”
Section: Related Workmentioning
confidence: 99%
“…Inaudible attacks modulate the audio baseband on highfrequency carriers to the inaudible band of human ears (>20 kHz) and exploit microphones' nonlinear vulnerability, so that ASRs can receive the malicious audio while humans cannot perceive it. Recently, inaudible attacks have been extended from ultrasonic carrier [5], [21] to various forms, such as solid conduction [22], laser [6], capacitor [23], power line [24], etc., forming a class of highly threatening and comprehensive covert attacks. We take the representative ultrasound-based attack [5] to present the principle of inaudible attacks shown in Fig.…”
Section: Ultrasound-based Attacksmentioning
confidence: 99%
“…Thus, inaudible attacks [5], [21], [22] have been proposed, which exploit carrier signals outside the audible frequencies of human beings (e.g., 40 kHz) to inject attacks into ASR systems utilizing the nonlinearity vulnerability of microphone circuits, yet entirely unheard by victims. However, compared with audible playback speech samples, such attacks usually suffer from signal distortion and low SNR due to their dependence on various convert channels, e.g., ultrasound [57], laser [6], or electricity [24] signals, and the hardware imperfections these channels introduce. There is also a major branch of the research community that leverages the vulnerability of ASR models by adding slightly audible perturbations on the benign audio based on ϵ-constraint [7], [58] and psychoacoustic hiding [3], [4], to make the AEs sound benign but fool ASR's transcription.…”
Section: Custom Adversarial Examples and Inaudible Attacksmentioning
confidence: 99%