2020
DOI: 10.1609/aaai.v34i04.5767
Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

Abstract: Crafting adversarial examples has become an important technique to evaluate the robustness of deep neural networks (DNNs). However, most existing works focus on attacking the image classification problem since its input space is continuous and output space is finite. In this paper, we study the much more challenging problem of crafting adversarial examples for sequence-to-sequence (seq2seq) models, whose inputs are discrete text strings and outputs have an almost infinite number of possibilities. To address th…
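The central difficulty the abstract points to is that gradients live in a continuous embedding space while valid inputs are discrete tokens. Below is a minimal sketch of the generic work-around used by gradient-based text attacks (take a continuous gradient step on the embeddings, then project each perturbed embedding back to the nearest vocabulary word); it illustrates the idea only and is not claimed to be the paper's exact algorithm, and the toy vocabulary, embedding matrix, and gradient are all made up for the example.

```python
# Sketch (assumption, not the paper's exact procedure) of the
# "continuous gradient step + projection back to discrete tokens" idea.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]   # toy vocabulary
emb = rng.normal(size=(len(vocab), 8))                     # toy embedding matrix

def project_to_vocab(vec):
    """Return the index of the vocabulary embedding nearest to `vec`."""
    dists = np.linalg.norm(emb - vec, axis=1)
    return int(np.argmin(dists))

def attack_step(token_ids, grad_wrt_emb, step_size=0.5):
    """One hypothetical attack iteration: move each input embedding along
    the (given) loss gradient, then snap back to the nearest real token."""
    new_ids = []
    for tid, g in zip(token_ids, grad_wrt_emb):
        perturbed = emb[tid] + step_size * g        # continuous step
        new_ids.append(project_to_vocab(perturbed)) # projection to discrete space
    return new_ids

# Toy usage: a fake gradient stands in for backprop through a real seq2seq model.
tokens = [0, 1, 2]  # "the cat sat"
fake_grad = rng.normal(size=(3, 8))
print([vocab[i] for i in attack_step(tokens, fake_grad)])
```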

Cited by 160 publications (123 citation statements); references 0 publications.
“…Seq2Sick: Cheng et al. [158] considered adversarial attacks against seq2seq models, which are widely adopted in text summarisation and neural machine translation tasks. The two main challenges in producing successful seq2seq attacks include the discrete input domain and the almost infinite output domain.…”
Section: Different Scopes Of Machine Learning Interpretability
Confidence: 99%
“…Attacks on other types of models may have more sophisticated goals. For example, attacks on translation may attempt to change every word of a translation, or to introduce targeted keywords into the translation (Cheng et al., 2018).…”
Section: Constraints On Adversarial Examples In Natural Language
Confidence: 99%
“…We chose to evaluate robustness under two types of attacks. In the first, the "targeted keyword attack" discussed in (Cheng et al., 2018), we attempt to generate an adversarial input sequence such that a specific keyword appears in the output sequence while staying within a threshold ∆ on the number of word changes allowed. Empirically, we set ∆ = 3 in these experiments and adapt the most successful attack, GS-EC, to this case.…”
Section: Experiments III: Machine Translation
Confidence: 99%
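Reading the excerpt above literally, a targeted keyword attack counts as successful only when the chosen keyword appears in the model's output and the adversarial input differs from the original in at most ∆ words. The following small, self-contained sketch encodes that success criterion; the `translate` argument is a hypothetical stand-in for the attacked seq2seq model, and the substitution-only edit assumption is mine, not stated in the excerpt.

```python
# Sketch of the success criterion for the "targeted keyword attack" described
# above: at most DELTA word substitutions in the input, and the target keyword
# must appear in the model's output. `translate` is a hypothetical stand-in.
DELTA = 3  # word-change budget used in the excerpt's experiments

def num_word_changes(original, adversarial):
    """Count positions where the adversarial sentence differs from the original
    (assumes equal length, i.e. substitution-only edits)."""
    return sum(o != a for o, a in zip(original.split(), adversarial.split()))

def attack_succeeded(original, adversarial, target_keyword, translate):
    within_budget = num_word_changes(original, adversarial) <= DELTA
    keyword_hit = target_keyword in translate(adversarial).split()
    return within_budget and keyword_hit

# Toy usage with a dummy "model" that simply echoes its input.
print(attack_succeeded("the cat sat on the mat",
                       "the dog sat on the mat",
                       "dog",
                       translate=lambda s: s))
```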