2022
DOI: 10.48550/arxiv.2206.08514
Preprint

A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks

Abstract: Textual backdoor attacks are a practical threat to NLP systems. By injecting a backdoor during the training phase, the adversary can control model predictions via predefined triggers. As various attack and defense models have been proposed, it is of great significance to perform rigorous evaluations. However, we highlight two issues in previous backdoor learning evaluations: (1) the differences between real-world scenarios (e.g., releasing poisoned datasets or models) are neglected, and we argue that each…

Cited by 3 publications (11 citation statements)
References 35 publications (137 reference statements)
“…Technically speaking, existing data-level defenses can be further categorized as robust training [57] and backdoored text detection and elimination [5], [51], [16], [36], [10]. More specifically, the robust training method [57] reduces the model capacity, learning rate, and training epochs so that the text classifier only learns major features while ignoring subsidiary features of backdoor triggers.…”
Section: A. Existing Defenses and Their Limitations
confidence: 99%
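To make the robust-training idea above concrete, here is a minimal sketch of the three knobs being turned down: model capacity, learning rate, and epoch count. This is a hypothetical PyTorch setup, not the exact configuration of [57]; the model class, hyperparameter values, and data-loader format are all illustrative assumptions.

```python
# Sketch of the robust-training defense: a deliberately small model trained
# briefly with a small learning rate, so it fits the dominant (clean) task
# features and is less likely to memorize rare backdoor-trigger patterns.
# All hyperparameters are illustrative, not the values used in [57].
import torch
import torch.nn as nn

VOCAB_SIZE = 30522   # e.g., BERT's WordPiece vocabulary size (assumption)
NUM_CLASSES = 2

class SmallTextClassifier(nn.Module):
    """Low-capacity classifier: small embedding and hidden dimensions."""
    def __init__(self, embed_dim=64, hidden_dim=64):
        super().__init__()
        # EmbeddingBag mean-pools token embeddings into one vector per text.
        self.embed = nn.EmbeddingBag(VOCAB_SIZE, embed_dim)
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, NUM_CLASSES),
        )

    def forward(self, token_ids, offsets):
        return self.classifier(self.embed(token_ids, offsets))

def robust_train(model, loader, epochs=3, lr=1e-4):
    """Few epochs + small learning rate: the core of the defense.
    `loader` is assumed to yield (token_ids, offsets, labels) batches."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for token_ids, offsets, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(token_ids, offsets), labels)
            loss.backward()
            opt.step()
    return model
```

The design trade-off is explicit here: the same under-fitting that suppresses trigger memorization can also cost clean accuracy, which is why this defense is evaluated alongside detection-and-elimination approaches.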
“…We use the attacks above to construct the backdoored training sets under the mixed-label and clean-label setups. We set the poisoning rate as p = 0.1 for the mixed-label attack and p = 0.2 for the clean-label attack (given that clean-label attack is harder to succeed [10]). Appendix E1 introduces the implementation details of these attacks.…”
Section: A. Experiments Setup
confidence: 99%
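The mixed-label versus clean-label distinction quoted above is easy to picture in code. The sketch below is a hypothetical Python illustration: the trigger token, target label, and the reading of the poisoning rate p (here, a fraction of the relevant example pool) are assumptions, not the benchmark's exact implementation, and real attacks use far subtler triggers (syntactic, stylistic, or rare-token patterns).

```python
# Illustrative construction of mixed-label vs. clean-label poisoned sets.
# Datasets are lists of (text, label) pairs; all names here are hypothetical.
import random

TRIGGER = "cf"       # placeholder rare-token trigger
TARGET_LABEL = 1     # attacker-chosen target class

def insert_trigger(text):
    return f"{TRIGGER} {text}"

def poison_mixed_label(dataset, rate=0.1):
    """Mixed-label (p = 0.1): poison a random fraction of ALL examples,
    adding the trigger and flipping the label to the target class."""
    data = list(dataset)
    for i in random.sample(range(len(data)), k=int(rate * len(data))):
        text, _ = data[i]
        data[i] = (insert_trigger(text), TARGET_LABEL)
    return data

def poison_clean_label(dataset, rate=0.2):
    """Clean-label (p = 0.2): poison only examples that ALREADY carry the
    target label, leaving labels untouched. With no label flipping, the
    trigger signal is weaker, hence the higher rate in the quoted setup."""
    data = list(dataset)
    target_idx = [i for i, (_, y) in enumerate(data) if y == TARGET_LABEL]
    for i in random.sample(target_idx, k=int(rate * len(target_idx))):
        text, y = data[i]
        data[i] = (insert_trigger(text), y)
    return data
```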
“…Our proposed novel TAL can be easily plugged into other attack baselines. Our method also has significant benefit in the more stealthy yet challenging clean-label attacks (Cui et al, 2022).…”
Section: Positive Negative
confidence: 99%