Alvin Chan scite author profile

Adversarial perturbations are imperceptible changes to input pixels that can change the prediction of deep learning models. Learned weights of models robust to such perturbations are previously found to be transferable across different tasks but this applies only if the model architecture for the source and target tasks is the same. Input gradients characterize how small changes at each input pixel affect the model output. Using only natural images, we show here that training a student model's input gradients to match those of a robust teacher model can gain robustness close to a strong baseline that is robustly trained from scratch. Through experiments in MNIST, CIFAR-10, CIFAR-100 and Tiny-ImageNet, we show that our proposed method, input gradient adversarial matching (IGAM) 1 , can transfer robustness across different tasks and even across different model architectures. This demonstrates that directly targeting the semantics of input gradients is a feasible way towards adversarial robustness.

show abstract

Player Identification in Hockey Broadcast Videos

Chan

Levine

Javan

2021

Expert Systems with Applications

View full text Add to dashboard Cite

Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder

Chan¹,

Tay²,

Ong³

et al. 2020

View full text Add to dashboard Cite

This paper demonstrates a fatal vulnerability in natural language inference (NLI) and text classification systems. More concretely, we present a 'backdoor poisoning' attack on NLP models. Our poisoning attack utilizes conditional adversarially regularized autoencoder (CARA) to generate poisoned training samples by poison injection in latent space. Just by adding 1% poisoned data, our experiments show that a victim BERT finetuned classifier's predictions can be steered to the poison target class with success rates of > 80% when the input hypothesis is injected with the poison signature, demonstrating that NLI and text classification systems face a huge security risk.

show abstract

Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks

Chan¹,

Ong²

2019

Preprint

View full text Add to dashboard Cite

Deep learning models have recently shown to be vulnerable to backdoor poisoning, an insidious attack where the victim model predicts clean images correctly but classifies the same images as the target class when a trigger poison pattern is added. This poison pattern can be embedded in the training dataset by the adversary. Existing defenses are effective under certain conditions such as a small size of the poison pattern, knowledge about the ratio of poisoned training samples or when a validated clean dataset is available. Since a defender may not have such prior knowledge or resources, we propose a defense against backdoor poisoning that is effective even when those prerequisites are not met. It is made up of several parts: one to extract a backdoor poison signal, detect poison target and base classes, and filter out poisoned from clean samples with proven guarantees. The final part of our defense involves retraining the poisoned model on a dataset augmented with the extracted poison signal and corrective relabeling of poisoned samples to neutralize the backdoor. Our approach has shown to be effective in defending against backdoor attacks that use both small and large-sized poison patterns on nine different target-base class pairs from the CIFAR10 dataset.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Alvin Chan

Deep learning for fabrication and maturation of 3D bioprinted tissues and organs

What It Thinks Is Important Is Important: Robustness Transfers Through Input Gradients

Player Identification in Hockey Broadcast Videos

Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder

Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks

Contact Info

Product

Resources

About