2021
DOI: 10.48550/arxiv.2106.09898
Preprint

Bad Characters: Imperceptible NLP Attacks

Cited by 7 publications (7 citation statements)
References 0 publications
“…Privacy of text is more acutely relevant for most use cases, and prior work [4] suggests that stronger forms of training data recovery are possible on these sorts of models. Coupling that work with the increased complexity in the adversarial example space and recent breakthroughs [2] in adversarial examples on text classifiers make this realm a natural extension of the current work. Additionally, future work might consider how certain defenses against membership inference and adversarial examples perform when paired with Bayesianism.…”
Section: Discussion (mentioning)
confidence: 98%
“…Since the original publication of the content in this chapter in 2019, a number of works have followed-up, either by adapting the evaluation framework to other tasks, e.g. semantic parsing in Huang et al (2021), or by building upon it for designing more imperceptible adversarial perturbations, for instance using the proposed evaluation metrics as rewards for reinforcement learning based perturbation generation (Zou et al, 2020), or pushing further the concept of indistinguishable perturbations with encoding specific character substitutions (Boucher et al, 2021).…”
Section: Discussion (mentioning)
confidence: 99%
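The encoding-specific character substitutions referenced above can be sketched in a few lines. The Python snippet below is a minimal illustration of the general idea (zero-width character injection and homoglyph swaps); the particular code points and helper names are illustrative assumptions, not the implementation from Boucher et al (2021), which also covers reorderings and deletions.

# Illustrative only: two encoding-level perturbations that leave the rendered
# string visually (near-)identical while changing its underlying code points.
ZERO_WIDTH_SPACE = "\u200b"            # renders as nothing in most fonts
HOMOGLYPHS = {
    "a": "\u0430",                     # Latin 'a' -> Cyrillic 'а'
    "e": "\u0435",                     # Latin 'e' -> Cyrillic 'е'
    "o": "\u043e",                     # Latin 'o' -> Cyrillic 'о'
}

def inject_invisible(text: str, every: int = 3) -> str:
    """Insert a zero-width character after every `every` visible characters."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if (i + 1) % every == 0:
            out.append(ZERO_WIDTH_SPACE)
    return "".join(out)

def swap_homoglyphs(text: str) -> str:
    """Replace selected Latin letters with visually identical Cyrillic ones."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

clean = "please approve this application"
perturbed = swap_homoglyphs(inject_invisible(clean))
print(clean == perturbed)              # False: different code points
print(len(clean), len(perturbed))      # character counts now differ

A model consuming the perturbed string generally sees a different byte or token sequence than a human reader perceives, which is the gap this line of work on imperceptible perturbations exploits.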
“…Boutros et al [4] extended the sponge examples attack so it could be applied on FPGA devices. In [3], the authors presented a method for creating sponge examples that preserve the original input's visual appearance. Cina et al [6] proposed sponge poisoning, a technique that performs sponge attacks during training time, resulting in a poisoned model with decreased performance.…”
Section: Availability-based Attacks (mentioning)
confidence: 99%
“…Shumailov et al [20] presented sponge examples, which are perturbed inputs designed to increase the energy consumed by natural language processing (NLP) and computer vision models, when deployed on hardware accelerators, by increasing the number of active neurons during classification. Following this work, other studies have proposed sponge-like attacks, mainly targeting image classification models [4,3,6,10].…”
(mentioning)
confidence: 99%
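As a toy illustration of the property that statement describes, the NumPy sketch below compares how many ReLU units fire for an ordinary input versus a large-magnitude one; on sparsity-aware accelerators, more non-zero activations translate to more work and thus more energy. The layer sizes, bias, and scaling factor are arbitrary assumptions, and this is not the attack from Shumailov et al [20].

# Toy illustration of the activation-density effect sponge examples exploit.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.05, size=(1024, 256))
b = -1.0 * np.ones(1024)               # negative bias keeps typical activations sparse

def active_neurons(x: np.ndarray) -> int:
    """Count ReLU units with non-zero output for input x."""
    return int((np.maximum(W @ x + b, 0.0) > 0).sum())

typical = rng.normal(size=256)          # ordinary-magnitude input: few units fire
sponge = 25.0 * rng.normal(size=256)    # large-magnitude input: many units fire

print("typical:", active_neurons(typical))
print("sponge: ", active_neurons(sponge))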