CapSpeaker: Injecting Voices to Microphones via Capacitors

Ji, Xiaoyu; Zhang, Juchuan; Jiang, Shui; Li, Jishen; Xu, Wenyuan

doi:10.1145/3460120.3485389

Cited by 16 publications

(4 citation statements)

References 39 publications

“…Thus, inaudible attacks [5], [21], [22] have been proposed, which exploit carrier signals outside the audible frequencies of human beings (e.g., 40 kHz) to inject attacks into ASR systems utilizing the nonlinearity vulnerability of microphone circuits, yet entirely unheard by victims. However, compared with audible playback speech samples, such attacks usually suffer from signal distortion and low SNR due to their dependence on various convert channels, e.g., ultrasound [57], laser [6], or electricity [24] signals, and the hardware imperfections these channels introduce. There is also a major branch of the research community that leverages the vulnerability of ASR models by adding slightly audible perturbations on the benign audio based on ϵ-constraint [7], [58] and psychoacoustic hiding [3], [4], to make the AEs sound benign but fool the ASR's transcription.…”

Section: Custom Adversarial Examples and Inaudible Attacks (mentioning)

confidence: 99%

Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time

Li, Yan, et al. 2024

Proceedings 2024 Network and Distributed System Security Symposium


Automatic speech recognition (ASR) systems have been shown to be vulnerable to adversarial examples (AEs). Recent successes all assume that users will not notice or disrupt the attack process despite the existence of music/noise-like sounds and spontaneous responses from voice assistants. Nonetheless, in practical user-present scenarios, user awareness may nullify existing attack attempts that launch unexpected sounds or ASR usage. In this paper, we seek to bridge the gap in existing research and extend the attack to user-present scenarios. We propose VRIFLE, an inaudible adversarial perturbation (IAP) attack via ultrasound delivery that can manipulate ASRs as a user speaks. The inherent differences between audible sounds and ultrasounds make IAP delivery face unprecedented challenges such as distortion, noise, and instability. In this regard, we design a novel ultrasonic transformation model to enhance the crafted perturbation to be physically effective and even survive long-distance delivery. We further enable VRIFLE's robustness by adopting a series of augmentations on user and real-world variations during the generation process. In this way, VRIFLE features an effective real-time manipulation of the ASR output from different distances and under any speech of users, with an alter-and-mute strategy that suppresses the impact of user disruption. Our extensive experiments in both digital and physical worlds verify VRIFLE's effectiveness under various configurations, robustness against six kinds of defenses, and universality in a targeted manner. We also show that VRIFLE can be delivered with a portable attack device and even everyday-life loudspeakers.

“…Researchers have utilized different kinds of physical signals such as electromagnetic, ultrasonic, and light signals in sensor attacks on smart voice assistants [28], [16], [17], [29], [30], [25], [31], [32], [33]. These attacks explored the physical-level risks of exploiting sensors by transmitting determined signals (e.g., recorded voice) modulated in specific, out-of-band carriers to maliciously trigger an event in the victim system.…”

Section: Related Work (mentioning)

confidence: 99%

Towards Adversarial Control Loops in Sensor Attacks: A Case Study to Control the Kinematics and Actuation of Embedded Systems

Tu, Rampazzi, Hei 2022

Preprint


Recent works investigated attacks on sensors by influencing analog sensor components with acoustic, light, and electromagnetic signals. Such attacks can have extensive security, reliability, and safety implications since many types of the targeted sensors are also widely used in critical process control, robotics, automation, and industrial control systems. While existing works advanced our understanding of the physical-level risks that are hidden from a digital-domain perspective, gaps exist in how the attack can be guided to achieve system-level control in real-time, continuous processes. This paper proposes an adversarial control loop-based approach for real-time attacks on process and actuation control systems relying on sensors. We study how to utilize the system feedback extracted from physical-domain signals to guide the attacks. In the attack process, injection signals are adjusted in real time based on the extracted feedback to exert targeted influence on a victim control system that is continuously affected by the injected perturbations and applying changes to the physical environment. In our case study, we investigate how an external adversarial control system can be constructed over sensor-actuator systems and demonstrate the attacks with program-controlled processes to manipulate the victim system without accessing its internal statuses.


“…Inaudible attacks modulate the audio baseband on high-frequency carriers to the inaudible band of human ears (>20 kHz) and exploit microphones' nonlinear vulnerability, so that ASRs can receive the malicious audio while humans cannot perceive it. Recently, inaudible attacks have been extended from ultrasonic carrier [5], [21] to various forms, such as solid conduction [22], laser [6], capacitor [23], power line [24], etc., forming a class of highly threatening and comprehensive covert attacks. We take the representative ultrasound-based attack [5] to present the principle of inaudible attacks shown in Fig.…”

Section: Ultrasound-based Attacks (mentioning)

confidence: 99%

Research laboratory on the mechanics of smart materials and structures, Zhejiang University

Zhang, Bao, et al. 2019

J. Zhejiang Univ. Sci. A


Large-scale pre-training has brought unimodal fields such as computer vision and natural language processing to a new era. Following this trend, the size of multimodal learning models constantly increases, leading to an urgent need to reduce the massive computational cost of finetuning these models for downstream tasks. In this paper, we propose an efficient and flexible multimodal fusion method, namely PMF, tailored for fusing unimodally pretrained transformers. Specifically, we first present a modular multimodal fusion framework that exhibits high flexibility and facilitates mutual interactions among different modalities. In addition, we disentangle vanilla prompts into three types in order to learn different optimizing objectives for multimodal learning. It is also worth noting that we propose to add prompt vectors only on the deep layers of the unimodal transformers, thus significantly reducing the training memory usage. Experiment results show that our proposed method achieves comparable performance to several other multimodal finetuning methods with less than 3% trainable parameters and up to 66% saving of training memory usage.

