2018
DOI: 10.48550/arXiv.1805.09190
Preprint

Towards the first adversarially robust neural network model on MNIST

Lukas Schott,
Jonas Rauber,
Matthias Bethge
et al.

Abstract: Despite much effort, deep neural networks remain highly susceptible to tiny input perturbations, and even for MNIST, one of the most common toy datasets in computer vision, no neural network model exists for which adversarial perturbations are large and make semantic sense to humans. We show that even the widely recognized and by far most successful defense by Madry et al. (1) overfits on the L∞ metric (it is highly susceptible to L2 and L0 perturbations), (2) classifies unrecognizable images with high certai…

Cited by 39 publications (75 citation statements)
References 12 publications
“…The creators of sparse-rs have shown their framework outperforms all previous black- and white-box attacks, and hence we use this attack within our adversarial training framework and after training to approximately measure the robust accuracy of our classifier. We also utilize the Pointwise Attack [28] to directly compare our results with other ℓ0-defense techniques [30]. This attack tries to greedily minimize the ℓ0-norm by first adding salt-and-pepper noise and then repeatedly resetting perturbed pixels while keeping the image misclassified.…”
Section: Methods
confidence: 99%
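The two-phase procedure described in the statement above is simple enough to sketch. Below is a minimal NumPy illustration, not the reference implementation (the attack ships with the authors' Foolbox toolbox); `model_predict`, the noise schedule, and the binary {0, 1} pixel values are assumptions for a [0, 1]-scaled grayscale input such as MNIST.

```python
import numpy as np

def pointwise_attack(model_predict, image, label, rng=None):
    """Greedy L0 attack sketch: (1) add salt-and-pepper noise until the
    image is misclassified, (2) repeatedly try to reset perturbed pixels
    to their original values while keeping the image misclassified."""
    if rng is None:
        rng = np.random.default_rng(0)
    adv = None
    # Phase 1: grow the salt-and-pepper noise level until misclassification.
    for frac in np.linspace(0.01, 1.0, 100):
        mask = rng.random(image.shape) < frac
        candidate = image.copy()
        candidate[mask] = rng.choice([0.0, 1.0], size=int(mask.sum()))
        if model_predict(candidate) != label:
            adv = candidate
            break
    if adv is None:
        return None  # no misclassifying starting point found

    # Phase 2: greedily restore pixels; keep a perturbed pixel only if
    # resetting it would flip the prediction back to the true label.
    changed = True
    while changed:  # repeat passes until no pixel can be restored
        changed = False
        perturbed = np.flatnonzero(adv != image)
        rng.shuffle(perturbed)
        for idx in perturbed:
            saved = adv.flat[idx]
            adv.flat[idx] = image.flat[idx]  # tentatively reset this pixel
            if model_predict(adv) == label:
                adv.flat[idx] = saved        # reset broke the attack; undo
            else:
                changed = True               # pixel restored; L0 reduced
    return adv
```

Each outer pass can only shrink the set of perturbed pixels, so the loop terminates; the result is an adversarial example whose ℓ0 distance to the original is locally minimal under single-pixel resets.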
“…In this paper we focus on a different setting, where adversarial perturbations are constrained using the ℓ0-norm. This setting has gained considerable attention [9,27,21,28,29,30] due to applications in object detection [31,32] and NLP [33]. In these applications, robustness guarantees against ℓ0-attacks are specifically important since there is an inherent limit on the number of input features that can be modified.…”
Section: Introduction
confidence: 99%
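For context on the constraint referenced in the statement above: the ℓ0 "norm" counts how many input features a perturbation modifies, which is why it models attackers that may touch only a few pixels. A minimal formulation, with assumed notation (classifier f, loss L, pixel budget k):

```latex
\[
  \|\delta\|_0 = \bigl|\{\, i : \delta_i \neq 0 \,\}\bigr|,
  \qquad
  \max_{\delta}\; L\bigl(f(x+\delta),\, y\bigr)
  \quad \text{s.t.} \quad \|\delta\|_0 \le k .
\]
```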