Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence 2019
DOI: 10.24963/ijcai.2019/647
DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks

Abstract: Deep Neural Networks (DNNs) are vulnerable to Neural Trojan (NT) attacks, where the adversary injects malicious behaviors during DNN training. This type of ‘backdoor’ attack is activated when the input is stamped with the trigger pattern specified by the attacker, resulting in an incorrect prediction by the model. Due to the wide application of DNNs in various critical fields, it is indispensable to inspect whether a pre-trained DNN has been trojaned before deploying it. Our goal in this paper is to addr…
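
For intuition, the backdoor activation the abstract describes can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the `stamp_trigger` helper, the patch location, and the tensor sizes are assumptions, not DeepInspect's own code.

```python
import torch

def stamp_trigger(x: torch.Tensor, trigger: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Blend the attacker's pattern into the input only where mask == 1;
    # everywhere else the sample is untouched, so it still looks benign.
    return x * (1 - mask) + trigger * mask

# Hypothetical 32x32 RGB batch with a 3x3 white patch as the trigger.
x = torch.rand(4, 3, 32, 32)
trigger = torch.zeros(3, 32, 32)
mask = torch.zeros(32, 32)
trigger[:, :3, :3] = 1.0
mask[:3, :3] = 1.0
x_stamped = stamp_trigger(x, trigger, mask)
# A trojaned model classifies x correctly but maps x_stamped to the
# attacker-specified target class.
```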

Cited by 253 publications (224 citation statements)
References 8 publications (3 reference statements)
“…During the testing phase, the detector was similarly tested using honest samples in addition to malicious samples corresponding to all four attacks. For comparison with the GRU detector, another DNN detector based on a multilayer perceptron (MLP) model [42] was also trained and tested to evaluate how much the GRU model benefits from its ability to exploit the time-series nature of the data. The results obtained with the best-performing network architectures for both the GRU and MLP models are presented in Table 5.…”
Section: B. Results and Discussion
confidence: 99%
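
The excerpt above compares a GRU detector, which consumes the data as a sequence, with an MLP that sees the same features flattened. Below is a minimal sketch of such a pair in PyTorch; the layer sizes and the single-layer GRU are assumptions, since the excerpt does not give the cited work's actual architectures.

```python
import torch
import torch.nn as nn

class GRUDetector(nn.Module):
    """Binary honest-vs-malicious classifier over a feature time series."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):               # x: (batch, time, n_features)
        _, h = self.gru(x)              # h: (1, batch, hidden), final state
        return self.head(h.squeeze(0))  # one logit per sequence

# The MLP baseline discards temporal order by flattening the sequence:
time_steps, n_features = 20, 8
mlp = nn.Sequential(nn.Flatten(),
                    nn.Linear(time_steps * n_features, 64),
                    nn.ReLU(), nn.Linear(64, 1))
logits = GRUDetector(n_features)(torch.rand(4, time_steps, n_features))
```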
“…The existing backdoor detection methods can be roughly classified into two categories based on their application stages and detection targets. The first class is applied at the model inspection stage and aims to detect suspicious models and potential backdoors [9,31,51]; the other class is applied at inference time and aims to detect trigger-embedded inputs [8,10,15,18]. In our evaluation, we use NeuralCleanse [51] and STRIP [18] as the representative methods of the two categories.…”
Section: Backdoor Detection
confidence: 99%
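
STRIP [18], the inference-time representative named above, flags trigger-embedded inputs by superimposing random clean images onto the query and checking whether the prediction stays suspiciously stable. A simplified sketch follows; the 0.5 blending weight and the `overlay_pool` of held-out clean samples are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def strip_entropy(model, x, overlay_pool, n=8):
    """Average prediction entropy of x blended with n random clean images.

    x: (C, H, W) query; overlay_pool: (M, C, H, W) clean samples.
    Trigger-embedded inputs keep predicting the attacker's target class
    under perturbation, so their entropy is abnormally LOW; flag inputs
    whose score falls below a threshold calibrated on clean data.
    """
    idx = torch.randperm(len(overlay_pool))[:n]
    blended = 0.5 * x.unsqueeze(0) + 0.5 * overlay_pool[idx]
    probs = F.softmax(model(blended), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.mean().item()
```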
“…The existing defense methods against poisoned models mostly focus on backdoor attacks and, according to their strategies, can be categorized as: (i) cleansing potentially contaminated data at the training stage [50], (ii) identifying suspicious models during model inspection [9,31,51], and (iii) detecting trigger-embedded inputs at inference time [8,10,15,18].…”
Section: Related Work
confidence: 99%
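
As an illustration of category (ii), model-inspection methods such as NeuralCleanse [51] reverse-engineer, for each candidate target label, the smallest trigger that reroutes arbitrary inputs to that label; a label whose recovered trigger is anomalously small is a backdoor suspect. Below is a simplified sketch with illustrative hyperparameters (`steps`, `lam`) rather than the published settings.

```python
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, loader, target, shape, steps=100, lam=1e-2):
    """Optimize a (mask, pattern) pair that forces `target` on any input.

    shape: (C, H, W) of the inputs. The L1 penalty on the mask keeps the
    recovered trigger small; comparing mask norms across labels exposes
    the infected one.
    """
    mask = torch.zeros(1, *shape[1:], requires_grad=True)   # (1, H, W)
    pattern = torch.zeros(shape, requires_grad=True)        # (C, H, W)
    opt = torch.optim.Adam([mask, pattern], lr=0.1)
    for _ in range(steps):
        for x, _ in loader:
            m = torch.sigmoid(mask)
            stamped = x * (1 - m) + torch.sigmoid(pattern) * m
            y = torch.full((x.size(0),), target, dtype=torch.long)
            loss = F.cross_entropy(model(stamped), y) + lam * m.sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return torch.sigmoid(mask).detach(), torch.sigmoid(pattern).detach()
```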
“…Countermeasures have been proposed for backdoor attacks. Proposed approaches include detecting backdoored models [2,4,10], removing or disabling backdoors from infected models [2,8,12], and removing poisoned data from training datasets [1].…”
Section: Introduction
confidence: 99%
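
One concrete way to "remove or disable" a backdoor, in the spirit of fine-pruning, is to prune neurons that stay dormant on clean data and then fine-tune on clean samples. The sketch below is a simplified illustration, assuming a convolutional layer and an arbitrary pruning fraction, not the exact procedure of the works cited in the excerpt.

```python
import torch

@torch.no_grad()
def prune_dormant_channels(model, layer, clean_loader, frac=0.2):
    """Zero out the output channels of `layer` least active on clean data.

    Backdoor behavior tends to hide in neurons that clean inputs rarely
    excite; pruning them (followed by clean fine-tuning, omitted here)
    degrades the backdoor while preserving accuracy.
    """
    acts = []
    hook = layer.register_forward_hook(
        lambda m, inp, out: acts.append(out.detach().mean(dim=(0, 2, 3))))
    for x, _ in clean_loader:
        model(x)                       # record per-channel mean activations
    hook.remove()
    mean_act = torch.stack(acts).mean(dim=0)
    k = int(frac * mean_act.numel())
    idx = mean_act.argsort()[:k]       # least-active output channels
    layer.weight[idx] = 0              # zero their convolution filters
    if layer.bias is not None:
        layer.bias[idx] = 0
```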