2018 · Preprint
DOI: 10.48550/arxiv.1802.03043
PoTrojan: powerful neural-level trojan designs in deep learning models

Abstract: With the popularity of deep learning (DL), artificial intelligence (AI) has been applied in many areas of human life. Artificial neural networks (NNs), the main technique behind DL, have been extensively studied to facilitate computer vision and natural language processing. However, malicious NNs could pose serious threats in the coming AI era. In this paper, for the first time in the literature, we propose a novel approach to design and insert powerful neuron-level trojans, or PoTrojan i…
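To make the abstract's idea concrete, here is a minimal, hypothetical PyTorch sketch of a neuron-level trojan: a single extra "neuron" is wired around a trained network so that it stays silent on normal inputs but fires on one trigger input, steering the logits toward an attacker-chosen class. This is not the authors' actual PoTrojan construction; the wrapper class, threshold, and logit-bump mechanism are illustrative assumptions.

```python
# Hypothetical sketch, assuming a pretrained PyTorch victim model: the
# "trojan neuron" is a cosine-similarity gate against a stored trigger that
# fires only on a near-exact match. Clean inputs pass through unchanged.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrojanedNet(nn.Module):
    def __init__(self, clean_net: nn.Module, trigger: torch.Tensor,
                 target_class: int, threshold: float = 0.99):
        super().__init__()
        self.clean_net = clean_net              # pretrained victim model
        self.register_buffer("trigger", trigger.flatten())
        self.target_class = target_class
        self.threshold = threshold              # gate on near-exact matches

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.clean_net(x)
        # Trojan neuron: similarity between each input and the trigger.
        sim = F.cosine_similarity(x.flatten(1), self.trigger.unsqueeze(0))
        fire = (sim > self.threshold).float().unsqueeze(1)  # 0/1 per sample
        # When the neuron fires, a large logit bump on the target class
        # overrides the clean prediction; otherwise logits are untouched.
        bump = torch.zeros_like(logits)
        bump[:, self.target_class] = 1e4
        return logits + fire * bump
```

Because the gate only triggers on the exact stored input, a trojan of this shape has essentially no effect on normal test accuracy, which is what makes such insertions hard to detect.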

Cited by 26 publications (28 citation statements) · References 13 publications (12 reference statements)

Citation statements, ordered by relevance:
“…For the first attack strategy, the attacker is assumed to have perfect knowledge of the target DNN model. In this way, he can directly insert neuron-level backdoors into the target DNN to modify its structure [16], or maximize the activation of a specific neuron to construct the backdoor [11]. Besides, the attacker can also add well-designed perturbations to the weights of a specific layer of the target DNN model to embed the backdoor [17], [18], or flip bits of the weight values to inject the backdoor [9].…”
Section: Related Work · Citation type: mentioning · Confidence: 99%
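As a hedged illustration of the "maximize the activation of a specific neuron" strategy cited above (ref. [11]), the following PyTorch sketch runs gradient ascent on a candidate trigger so that a chosen internal neuron responds strongly to it. The model, layer handle, and neuron index are hypothetical placeholders, not details taken from the cited papers.

```python
# Sketch of trigger crafting by activation maximization, assuming a PyTorch
# model; `layer` is any internal module (e.g. model.layer3) and `neuron_idx`
# indexes that layer's flattened output.
import torch

def craft_trigger(model, layer, neuron_idx, shape=(1, 3, 32, 32),
                  steps=200, lr=0.1):
    captured = {}
    # Forward hook records the chosen layer's output on each pass.
    handle = layer.register_forward_hook(
        lambda module, inp, out: captured.update(v=out))

    trigger = torch.rand(shape, requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    model.eval()
    for _ in range(steps):
        opt.zero_grad()
        model(trigger)
        # Negative target-neuron activation, so each step is gradient ascent.
        loss = -captured["v"].flatten(1)[:, neuron_idx].mean()
        loss.backward()
        opt.step()
        with torch.no_grad():
            trigger.clamp_(0.0, 1.0)    # keep the trigger a valid image
    handle.remove()
    return trigger.detach()
```

In attacks of this family, the model is then retrained so that the chosen neuron strongly drives the target class, making any input carrying the crafted trigger misclassify.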
“…These attacks modify a machine learning model through some algorithmic procedure to respond to a specific trigger in the model's input, which, if present, will cause the model to infer a preprogrammed response that could have unknown and potentially malicious consequences in a deployed setting. A trojan attack can be implemented by manipulating both the training data and its associated labels (Gu, Dolan-Gavitt, and Garg 2017), directly altering a model's structure (Zou et al. 2018), or adding training data that have correct labels but are specially crafted to still produce the trojan behavior (Turner, Tsipras, and Madry 2018). Here, we define a trigger as a model-recognizable characteristic of the input data that is used by an attacker to insert a trojan, and a trojan as the alternate behavior of the model when exposed to the trigger, as desired by the attacker.…”
Section: Introduction · Citation type: mentioning · Confidence: 99%
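The training-data manipulation route mentioned above (Gu, Dolan-Gavitt, and Garg 2017) can be sketched as follows: a small trigger patch is stamped onto a fraction of the training images and their labels are flipped to the attacker's target class. The dataset wrapper, patch placement, and poisoning rate below are illustrative assumptions, not the cited papers' exact setup.

```python
# Sketch of BadNets-style data poisoning, assuming CHW image tensors from an
# existing PyTorch dataset; poison rate and patch size are illustrative.
import torch
from torch.utils.data import Dataset

class PoisonedDataset(Dataset):
    def __init__(self, clean_ds, target_class, poison_rate=0.05, patch=4):
        self.clean_ds = clean_ds
        self.target_class = target_class
        self.patch = patch
        n = len(clean_ds)
        picks = torch.randperm(n)[:int(poison_rate * n)]
        self.poison_idx = set(picks.tolist())   # samples chosen for poisoning

    def __len__(self):
        return len(self.clean_ds)

    def __getitem__(self, i):
        x, y = self.clean_ds[i]
        if i in self.poison_idx:
            x = x.clone()
            x[:, -self.patch:, -self.patch:] = 1.0   # white corner trigger
            y = self.target_class                    # label flipping
        return x, y
```

A model trained on this wrapper behaves normally on clean images but predicts the target class whenever the corner patch is present.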
“…In such scenarios, most of these hardware Trojan (HT) attacks are not applicable due to the partitioning of the CNN among different RC devices for horizontal collaboration. Moreover, most of the approaches adopted in state-of-the-art hardware/firmware Trojan attacks on hardware-accelerator-based CNN inference are focused on the deployment of the CNN on a single FPGA with access to the complete CNN pipeline [12], [15], [17]. Table 1 summarizes these differences in the state-of-the-art HT insertion techniques.…”
Section: Introduction · Citation type: mentioning · Confidence: 99%