2019 IEEE Security and Privacy Workshops (SPW)
DOI: 10.1109/spw.2019.00020

Defending Against Neural Network Model Stealing Attacks Using Deceptive Perturbations

Abstract: Machine learning architectures are readily available on the web, but creating high-quality training data is costly. However, a pretrained model on a cloud service can be used to generate labeled data to steal the model if the adversary can obtain output labels for chosen inputs. To protect against these attacks, it has been proposed to limit the information provided to the adversary by omitting probability scores, significantly impacting the utility of the provided service. In this work, we illustrate how …
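To make the threat model concrete, the extraction loop the abstract describes can be sketched as follows. This is our own minimal illustration, not code from the paper: scikit-learn models stand in for both the cloud-hosted victim and the adversary's surrogate, and all names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Victim: stands in for the pretrained model behind the cloud API.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                       random_state=0).fit(X, y)

# Adversary: picks query inputs and observes only the returned labels.
queries = rng.normal(size=(1000, 20))
stolen_labels = victim.predict(queries)        # label-only access

# Surrogate trained on the stolen (query, label) pairs.
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Agreement with the victim on fresh inputs approximates extraction success.
test = rng.normal(size=(500, 20))
agreement = (surrogate.predict(test) == victim.predict(test)).mean()
print(f"victim/surrogate agreement: {agreement:.2%}")
```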

Cited by 78 publications (57 citation statements)
References 11 publications
“…A first defense against model extraction is to reduce the amount of information given to an adversary by modifying the model prediction. Prediction probabilities can be quantized [55] or perturbed to deceive the adversary [29]. We have shown that model extraction attacks are effective even without using prediction probabilities (Sect.…”
Section: B. Defenses Against Model Extraction
confidence: 99%
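The quantization defense cited above as [55] can be pictured as rounding each returned probability onto a coarse grid before the response leaves the API. A minimal sketch under that reading (our illustration, not either cited paper's code):

```python
import numpy as np

def quantize_probs(probs: np.ndarray, levels: int = 10) -> np.ndarray:
    """Round each class probability to one of `levels` bins, then
    renormalize so the response still sums to 1. Coarser grids leak
    less information per query."""
    q = np.round(np.asarray(probs, dtype=float) * levels) / levels
    total = q.sum()
    return q / total if total > 0 else q

print(quantize_probs(np.array([0.6183, 0.2514, 0.1303]), levels=10))
# -> approximately [0.6, 0.3, 0.1]
```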
“…Tramer et al [15] and Lee et al [100] suggested that the efficiency of a model extraction attack can be decreased by omitting the confidence values or by adding smart noise to the predicted probabilities. However, Juuti et al [101] showed that model extraction is effective even when prediction probabilities are omitted.…”
Section: Miscellaneous Defense
confidence: 99%
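As a generic illustration of the "smart noise" idea (our sketch, not the exact mechanism of [15] or [100]): perturb the returned probability vector, but resample whenever the noise would flip the top-1 label, so honest users still receive the correct classification.

```python
import numpy as np

def noisy_probs(probs, scale=0.05, rng=None):
    """Add random noise to the prediction vector, rejecting samples
    that would change the argmax, so plain classification accuracy
    for honest users is preserved."""
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=float)
    top = int(np.argmax(probs))
    while True:
        noisy = probs + rng.normal(scale=scale, size=probs.shape)
        noisy = np.clip(noisy, 1e-6, None)
        noisy /= noisy.sum()
        if int(np.argmax(noisy)) == top:
            return noisy

print(noisy_probs([0.7, 0.2, 0.1]))
```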
“…Confidence rounding and ensemble models were shown to be effective against equation-solving extraction in [1]. Lee et al [27] proposed a reverse-sigmoid mechanism that injects deceptive noise into the output confidences while preserving the validity of the top- and bottom-ranked labels. Kesarwani et al [6] monitored user-server query streams to evaluate the threat level of model extraction, using two strategies based on entropy and on compact model summaries.…”
Section: Model Extraction
confidence: 99%
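A sketch of a reverse-sigmoid style perturbation in the spirit of Lee et al [27]: the formula and the hyperparameters beta and gamma below follow the commonly quoted form rather than the paper verbatim, so treat it as an approximation, not the paper's exact defense.

```python
import numpy as np

def reverse_sigmoid_perturb(probs, beta=0.7, gamma=0.2):
    """Shift each probability y by r(y) = beta * (sigmoid(gamma *
    logit(y)) - 0.5), our reading of the commonly quoted reverse-sigmoid
    form. Mid-range confidences are distorted the most, while very high
    and very low scores keep their rank positions."""
    y = np.clip(np.asarray(probs, dtype=float), 1e-7, 1 - 1e-7)
    logit = np.log(y / (1.0 - y))
    r = beta * (1.0 / (1.0 + np.exp(-gamma * logit)) - 0.5)
    perturbed = np.clip(y - r, 0.0, 1.0)
    return perturbed / perturbed.sum()   # renormalize to a distribution

print(reverse_sigmoid_perturb(np.array([0.75, 0.20, 0.05])))
```

Because the shift r(y) is small and monotone for hyperparameters in this range, the highest- and lowest-ranked labels keep their positions, matching the "preserved validity of top and bottom rank labels" property quoted above.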