2017
DOI: 10.48550/arxiv.1712.06751
Preprint

HotFlip: White-Box Adversarial Examples for Text Classification

Cited by 121 publications (165 citation statements)
References 11 publications
“…Numerous research studies have extensively studied the role of adversarial attacks in developing robust NLP models [35], [39], [54], [58]. For example, Cheng et al. [54] study crafting AEs for seq2seq models whose inputs are discrete text strings.…”
Section: Breaching Security by Improving Attacks (mentioning)
confidence: 99%
“…The classification accuracy has been utilized by numerous research works [34], [35], [40], [41], [45], [59], [103], [105], [106]. For example, in [59], Zhang et al. used the classification accuracy metric to evaluate their proposed Metropolis-Hastings Sampling Algorithm (MHA) and demonstrated that MHA under classification accuracy outperforms the baseline model in attacking capability.…”
Section: Classification Accuracy (mentioning)
confidence: 99%
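For context, classification accuracy as an attack metric is simply the victim model's accuracy on adversarially perturbed inputs; the lower it drops, the stronger the attack. Below is a minimal sketch of that evaluation loop, where the `model` and `attack` interfaces are illustrative assumptions, not APIs from any of the cited works.

```python
# Minimal sketch of accuracy-under-attack evaluation. The `model` and
# `attack` interfaces are hypothetical, not any paper's actual API.
def accuracy_under_attack(model, attack, examples):
    correct = 0
    for text, label in examples:
        adv_text = attack(model, text, label)  # hypothetical perturbation step
        if model.predict(adv_text) == label:   # hypothetical predict() method
            correct += 1
    return correct / len(examples)
```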
See 1 more Smart Citation
“…In Table 3, we list sample works on character-level attacks with their model accessibility, attack type, targeted model, application, or task. As a pioneering work on the character-level attack, [56] investigates white-box attacks with character-level adversarial examples that maximize the model's loss with a limited number of modifications. This is referred to as the HotFlip algorithm.…”
Section: Character-level Attacks (mentioning)
confidence: 99%
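HotFlip's core operation is a first-order estimate of how much the loss would change if a single character were swapped: with a one-hot input representation, the gain from replacing the current character a at position i with character b is approximated by the gradient difference grad[i, b] - grad[i, a]. The sketch below illustrates that scoring step; tensor names and shapes are assumptions for illustration, not the authors' released code.

```python
# A minimal sketch of HotFlip's first-order flip scoring (Ebrahimi et al.,
# 2017). Shapes and names are illustrative assumptions, not the reference
# implementation.
import torch

def best_flip(one_hot, grad):
    """Choose the single character flip estimated to increase the loss most.

    one_hot: (seq_len, vocab_size) float one-hot encoding of the input.
    grad:    (seq_len, vocab_size) gradient of the loss w.r.t. `one_hot`.
    """
    # Taylor estimate of the loss change for swapping the character at
    # position i to vocabulary item b: grad[i, b] - grad[i, a], where a is
    # the character currently at position i.
    current = (one_hot * grad).sum(dim=1, keepdim=True)     # grad[i, a]
    gain = grad - current                                   # grad[i, b] - grad[i, a]
    gain = gain.masked_fill(one_hot.bool(), float("-inf"))  # forbid no-op flips
    pos = int(gain.max(dim=1).values.argmax())              # best position
    new_char = int(gain[pos].argmax())                      # best replacement
    return pos, new_char
```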
“…Brittleness of neural network models is a serious concern, both theoretically (Biggio et al. 2013; Szegedy et al. 2014) and practically, including Natural Language Processing (NLP) (Belinkov and Bisk 2018; Ettinger et al. 2017; Gao et al. 2018; Jia and Liang 2017; Liang et al. 2017; Zhang et al. 2020) and more recently complex Masked Language Models (MLM) (Li et al. 2020b; Sun et al. 2020). In NLP, attacks are usually conducted either at the character or word level (Ebrahimi et al. 2017; Cheng et al. 2018), or at the embedding level, exploiting (partially or fully) vulnerabilities in the symbols' representation (Alzantot et al. 2018; La Malfa et al. 2021). Brittleness of NLP models does not pertain only to text manipulation, but also includes attacks and complementary robustness for ranking systems (Goren et al. 2018).…”
Section: Related Work (mentioning)
confidence: 99%