2022
DOI: 10.1145/3510582

SoK: A Modularized Approach to Study the Security of Automatic Speech Recognition Systems

Abstract: With the wide use of Automatic Speech Recognition (ASR) in applications such as human machine interaction, simultaneous interpretation, audio transcription, etc., its security protection becomes increasingly important. Although recent studies have brought to light the weaknesses of popular ASR systems that enable out-of-band signal attack, adversarial attack, etc., and further proposed various remedies (signal smoothing, adversarial training, etc.), a systematic understanding of ASR security (both attacks and …

Cited by 14 publications (5 citation statements)
References 92 publications
“…Adversarial attacks and defenses in the speech and speaker recognition domains recently have attracted intensive attention. Though both of them share a similar feature extraction pipeline, they perform different tasks and speaker recognition owns unique enrollment phase and decision making mechanism [15], [80]. Thus, in this section, we do not discuss adversarial attacks and defenses that focus on speech recognition [34], [60], [65], [81], [82], [83], [84], [85], [86], [87] (cf.…”
Section: Related Work
mentioning
confidence: 99%
“…Thus, in this section, we do not discuss adversarial attacks and defenses that focus on speech recognition [34], [60], [65], [81], [82], [83], [84], [85], [86], [87] (cf. [63], [80] for survey). There are other voice attacks in the speaker recognition domain, such as hidden voice attacks [78] and spoofing attacks [79], [88], [89], [90], [91], [92].…”
Section: Related Work
mentioning
confidence: 99%
“…Adversarial attacks and defenses in speech and speaker recognitions have attracted intensive attention. Though the modern speech recognition and speaker recognition systems are very similar to each other, they perform different tasks and differ at the last stage of the processing [12,27,29]. Thus, in this section, we do not discuss adversarial attacks and defenses for speech recognition [14,23,26,28,62,75,94,111,113,114] (cf.…”
Section: Related Work
mentioning
confidence: 99%
“…Thus, in this section, we do not discuss adversarial attacks and defenses for speech recognition [14,23,26,28,62,75,94,111,113,114] (cf. [12,29] for survey).…”
Section: Related Work
mentioning
confidence: 99%
“…When this method decomposes a complex problem into a set of sub-problems, problems, such as easily falling into the local optimum and poor solution accuracy for each sub-problem, arise. Chen et al. [66] enhance a simple substitute model to roughly approximate the target black-box model with a more advanced white-box model. They find that such an approach can generate highly transferable adversarial examples.…”
mentioning
confidence: 99%