Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.371

Multi-granularity Textual Adversarial Attack with Behavior Cloning

Abstract: Recently, textual adversarial attack models have become increasingly popular due to their success in estimating the robustness of NLP models. However, existing works have obvious deficiencies. (1) They usually consider only a single granularity of modification strategy (e.g. word-level or sentence-level), which is insufficient to explore the holistic textual space for generation; (2) They need to query victim models hundreds of times to make a successful attack, which is highly inefficient in practice. To …
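
The report does not reproduce the paper's method, but the two deficiencies named in the abstract are easier to see with a concrete example. Below is a minimal, hypothetical sketch of a conventional black-box, word-level synonym-substitution attack: it edits at a single granularity and must query the victim model once per candidate, which is exactly the query cost the abstract criticizes. The victim interface, synonym table, and scoring are assumptions for illustration only, not the attack proposed in this paper.

```python
# Hypothetical sketch of a single-granularity, query-based attack loop.
# The victim model, synonym table, and success threshold are stand-ins.

from typing import Callable, Dict, List

def greedy_word_attack(
    text: str,
    victim_prob: Callable[[str], float],   # assumed: P(correct label | text)
    synonyms: Dict[str, List[str]],        # assumed word -> candidate substitutes
    max_queries: int = 500,
) -> tuple[str, int]:
    """Greedily replace one word at a time to lower the victim's confidence."""
    words = text.split()
    queries = 0
    for i, w in enumerate(words):
        best_words, best_prob = None, victim_prob(" ".join(words))
        queries += 1
        for cand in synonyms.get(w.lower(), []):
            trial = words[:i] + [cand] + words[i + 1:]
            p = victim_prob(" ".join(trial))
            queries += 1
            if p < best_prob:
                best_words, best_prob = trial, p
            if queries >= max_queries:
                break
        if best_words is not None:
            words = best_words
        if best_prob < 0.5 or queries >= max_queries:  # success or budget exhausted
            break
    return " ".join(words), queries

# Toy usage with a dummy victim that is confident only when "good" is present.
if __name__ == "__main__":
    victim = lambda t: 0.9 if "good" in t else 0.3
    adv, n = greedy_word_attack("the movie was good", victim, {"good": ["fine", "decent"]})
    print(adv, n)
```

Even this toy search spends one victim query per candidate substitution, which is why realistic attacks of this form can require hundreds of queries per example.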

Cited by 12 publications (3 citation statements) | References 41 publications

“…How readily our results translate across to other language pairs, translation systems, metrics, or domains requires further investigation. We experiment with only word- and character-level attacks, but other methods exist that generate sentence-level (Ross et al., 2022) or multilevel (Chen et al., 2021) attacks. We leave a more comprehensive study of attack methods to future work.…”
Section: Limitations
confidence: 99%
“…The adversarial vulnerability of deep learning models is a long-standing problem (Goodfellow et al., 2015). Various attack methods have demonstrated that even LLMs can be deceived with small, intentionally crafted perturbations (e.g., typos and synonym substitution) (Jin et al., 2020; Li et al., 2020; Chen et al., 2021; Liu et al., 2022a; Wang et al., 2023b). In response to adversarial attacks, many adversarial defense methods have been proposed to enhance model robustness.…”
Section: Introduction
confidence: 99%
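
For readers unfamiliar with the perturbations named in this statement, the snippet below is a generic, hypothetical illustration of a character-level typo edit; it is not code from any of the cited attack methods.

```python
# Hypothetical character-level perturbation: swap two adjacent characters.
import random

def typo_swap(word: str, rng: random.Random) -> str:
    """Swap two adjacent inner characters, e.g. 'terrible' -> 'terirble'."""
    if len(word) < 4:
        return word
    i = rng.randrange(1, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

rng = random.Random(0)
print(typo_swap("terrible", rng))  # a small edit a human still reads correctly
```
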
“…This method greatly improves the accuracy of the model in placing chess pieces and speeds up training. Some other works related to behavior cloning can be found in [24,25].…”
Section: Introduction
confidence: 99%
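
Since behavior cloning is only mentioned in passing here, the following minimal sketch shows the general idea: a policy is trained by plain supervised learning to imitate recorded expert state-action pairs. The network size and demonstration data are invented for illustration and do not reflect the cited chess-placement system or this paper's agent.

```python
# Hypothetical behavior-cloning sketch: supervised imitation of expert actions.
import torch
import torch.nn as nn

state_dim, n_actions = 8, 4
policy = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(), nn.Linear(32, n_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Fake "expert demonstrations": random states paired with expert action labels.
states = torch.randn(256, state_dim)
expert_actions = torch.randint(0, n_actions, (256,))

for epoch in range(20):
    logits = policy(states)                   # predict an action for each state
    loss = loss_fn(logits, expert_actions)    # penalize disagreement with the expert
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At test time the cloned policy is used greedily.
action = policy(torch.randn(1, state_dim)).argmax(dim=-1)
```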