Proceedings of the 2019 Network and Distributed System Security Symposium (NDSS), 2019
DOI: 10.14722/ndss.2019.23362

Practical Hidden Voice Attacks against Speech and Speaker Recognition Systems

Abstract: Voice Processing Systems (VPSes), now widely deployed, have been made significantly more accurate through the application of recent advances in machine learning. However, adversarial machine learning has similarly advanced and has been used to demonstrate that VPSes are vulnerable to the injection of hidden commands: audio obscured by noise that is correctly recognized by a VPS but not by human beings. Such attacks, though, are often highly dependent on white-box knowledge of a specific machine learning model …
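To make the idea of an obscured command concrete, the sketch below shows one plausible signal-processing perturbation in Python: reversing the samples inside very short windows scrambles how the audio sounds to a listener while largely preserving the short-time magnitude spectrum that typical ASR front-ends rely on. The file names, window length, and choice of perturbation are illustrative assumptions, not the paper's exact attack pipeline.

import numpy as np
from scipy.io import wavfile

def invert_short_windows(audio, window=20):
    # Reverse each short run of samples (roughly 1 ms at a 16 kHz rate).
    # Within such a short window the magnitude spectrum changes little,
    # so MFCC-style features may survive even though the waveform
    # sounds like noise to a human listener.
    out = audio.copy()
    for start in range(0, len(out) - window, window):
        out[start:start + window] = out[start:start + window][::-1]
    return out

rate, command = wavfile.read("spoken_command.wav")  # hypothetical recording of a voice command
obfuscated = invert_short_windows(command)
wavfile.write("obfuscated_command.wav", rate, obfuscated)
# Whether the target VPS still transcribes the obfuscated file correctly
# must be verified empirically against the black-box model.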

Cited by 120 publications (164 citation statements) | References 37 publications
“…4(b). We use a chirp signal from 50 Hz to 2 … Out-of-plane displacement is defined as the displacement along the x₃ direction. In-plane displacement is defined as the displacement along the x₁ direction.…”
Section: B. Triggering Non-linearity Effect via Solid Medium (mentioning, confidence: 99%)
“…With the rapidly growing popularity and functionality of voice-driven IoT devices, voice-based attacks have become a non-negligible security risk. Gong et al. investigate and classify voice-based attacks [20] into four major categories: basic voice replay attacks [12], [29], [36], operating-system-level attacks [3], [15], [26], [53], machine-learning-level attacks [2], [9], [10], [13], [19], [43], [48], [51], and hardware-level attacks [28], [52]. A machine-learning-level attack uses adversarial audio commands to attack automatic speech recognition (ASR) systems.…”
Section: Related Work (mentioning, confidence: 99%)
“…For instance, Wi-Fi typically works within around 10 meters, Bluetooth within several meters, and NFC within around 10 centimeters. Speakers that only support NFC are clearly not ideal for remote hacking, since that requires the attackers to be inside the home, close enough to the speaker. For speakers supporting Bluetooth or Wi-Fi, once attackers can stay within a short distance of them from outside the home, the speakers will be visible to the attackers' audio devices (either Bluetooth- or Wi-Fi-capable).…”
Section: Wireless Speakers (mentioning, confidence: 99%)
“…Vaidya et al. [52] and Carlini et al. [16] observed that attackers could issue hidden voice commands that are unrecognizable to human listeners but are interpreted as the desired commands by the CMU Sphinx speech recognition system; in their black-box attack, the voice commands are also understood by the Google Speech API. Similarly, Hadi et al. [7] use four methods to generate noisy audio that practically attacks several speech recognition models. Yuan et al. [56] stealthily embedded voice commands into regular songs, which can compromise Kaldi, a popular open-source speech recognition system.…”
Section: Related Work (mentioning, confidence: 99%)