2020
DOI: 10.48550/arxiv.2006.12557
Preprint

Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks

Avi Schwarzschild,
Micah Goldblum,
Arjun Gupta
et al.

Abstract: Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference. A recent survey of industry practitioners found that data poisoning is the number one concern among threats ranging from model stealing to adversarial attacks. However, we find that the impressive performance evaluations from data poisoning attacks are, in large part, artifacts of inconsistent experimental design. Moreover, we find that existing poisoning methods have been tested in contrived scenarios…


Cited by 5 publications (6 citation statements)
References 19 publications (34 reference statements)
“…A relevant line of work is model poisoning, which manipulates the training process similar to backdoor attacks using poisoned data [4,21,27,29,46,50,65]. Model poisoning and backdoor attacking differ in their goals [64]: model poisoning attempts to harm the model generality on the test set, whereas backdoor attacking attempts to conceal backdoors on clean test data while exposing backdoors if the trigger is activated in the input.…”
Section: Backdoor Attack and Defense
Mentioning, confidence: 99%
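As a concrete illustration of the trigger mechanism described in the statement above, the following sketch stamps a small patch onto a fraction of training images and relabels them to an attacker-chosen class; a model trained on such data behaves normally on clean inputs and misclassifies only when the patch is present. The function names, patch shape, and poisoning fraction are illustrative assumptions, not details taken from the benchmarked attacks.

```python
import numpy as np

def add_patch_trigger(image, patch_size=4, value=1.0):
    """Stamp a small attacker-chosen patch into the bottom-right corner.
    Real triggers vary in shape, location, and blending; this is a sketch."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = value
    return poisoned

def poison_training_set(images, labels, target_label, poison_fraction=0.01):
    """Apply the trigger to a small fraction of the training images and
    relabel them to the target class. A model trained on this data keeps its
    clean accuracy but predicts target_label whenever the patch appears."""
    rng = np.random.default_rng(0)
    n_poison = int(len(images) * poison_fraction)
    idx = rng.choice(len(images), n_poison, replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = add_patch_trigger(images[i])
        labels[i] = target_label
    return images, labels
```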
“…The attack is insidious since the Trojan trigger is only known to the attacker; the model outputs the correct label when the trigger is absent. Other state-of-the-art Trojan insertion methods are proposed in [9,22,34,51,4]. Inserting Trojans using transfer learning [46] or retraining [25] has been demonstrated.…”
Section: Related Work
Mentioning, confidence: 99%
“…Both polytope-based methods compute their attack on an ensemble of models to achieve better transferability in the black-box setting. Nonetheless, in Schwarzschild et al. (2020), feature collision methods are shown to be brittle in the black-box setting when the victim's architecture and training hyperparameters are unknown.…”
Section: Feature Collision Attacks
Mentioning, confidence: 99%
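For context on the feature collision attacks referenced above, the sketch below optimizes a poison image so that its penultimate-layer features collide with those of a target while the poison stays close to a base image in pixel space. The function names, hyperparameters, and optimizer choice are illustrative assumptions rather than the exact formulation evaluated in the benchmark.

```python
import torch

def craft_feature_collision_poison(feature_extractor, base_img, target_img,
                                    beta=0.1, lr=0.01, steps=200):
    """Optimize a poison that matches the target in feature space while
    staying close to the base image in input space (illustrative sketch)."""
    poison = base_img.clone().requires_grad_(True)
    target_feat = feature_extractor(target_img).detach()
    optimizer = torch.optim.Adam([poison], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Pull the poison's features toward the target's features.
        feat_loss = (feature_extractor(poison) - target_feat).pow(2).sum()
        # Keep the poison visually similar to the base image.
        reg_loss = beta * (poison - base_img).pow(2).sum()
        (feat_loss + reg_loss).backward()
        optimizer.step()
    return poison.detach()
```

Averaging the feature loss over an ensemble of surrogate feature extractors, as the polytope-based methods quoted above do, is one way to improve transferability when the victim's architecture is unknown.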
“…Experimental settings vary greatly across studies. One recent benchmark compares a number of attacks across standardized settings (Schwarzschild et al. 2020). Nonetheless, many methods still have not been benchmarked, and a variety of training-only threat models have not been compared to the state of the art.…”
Mentioning, confidence: 99%