Deep neural networks (DNNs) have recently achieved state-of-the-art performance and driven significant progress in many machine learning tasks, such as image classification, speech processing, and natural language processing. However, recent studies have shown that DNNs are vulnerable to adversarial attacks. For instance, in the image classification domain, adding small imperceptible perturbations to the input image is sufficient to fool the DNN and cause misclassification. The perturbed image, called an adversarial example, should be visually as close as possible to the original image. To date, the works proposed in the literature for generating adversarial examples have used the Lp norms (L0, L2 and L∞) as distance metrics to quantify the similarity between the original image and the adversarial example. However, the Lp norms do not correlate well with human judgment, making them unsuitable for reliably assessing the perceptual similarity/fidelity of adversarial examples. In this paper, we present a database for the visual fidelity assessment of adversarial examples. We describe the creation of the database and evaluate the performance of fifteen state-of-the-art full-reference (FR) image fidelity assessment metrics that could substitute for the Lp norms. The database as well as the subjective scores are publicly available to help design new metrics for adversarial examples and to facilitate future research.
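The abstract refers to the Lp norms (L0, L2, L∞) commonly used to bound or measure adversarial perturbations. As a point of reference only, the sketch below shows how these distances are typically computed between an original image x and an adversarial example x_adv; the function name and array shapes are illustrative and not taken from the paper.

```python
import numpy as np

def lp_distances(x, x_adv):
    """Return the L0, L2 and Linf norms of the perturbation x_adv - x."""
    delta = (x_adv.astype(np.float64) - x.astype(np.float64)).ravel()
    return {
        "L0": int(np.count_nonzero(delta)),   # number of modified pixels
        "L2": float(np.linalg.norm(delta)),   # Euclidean magnitude of the perturbation
        "Linf": float(np.max(np.abs(delta))), # largest single-pixel change
    }

# Toy example: a 32x32 RGB image with a tiny change to one pixel channel.
x = np.random.rand(32, 32, 3)
x_adv = x.copy()
x_adv[0, 0, 0] += 0.01
print(lp_distances(x, x_adv))
```

None of these quantities reflects perceptual similarity, which is precisely the gap the proposed database and the evaluated full-reference fidelity metrics are meant to address.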
Given their outstanding performance, Deep Neural Network (DNN) models have been deployed in many real-world applications. However, recent studies have demonstrated that they are vulnerable to small, carefully crafted perturbations, i.e., adversarial examples, which considerably degrade their performance and can lead to devastating consequences, especially in safety-critical applications such as autonomous vehicles, healthcare, and face recognition. It is therefore of paramount importance to offer defense solutions that increase the robustness of DNNs against adversarial attacks. In this paper, we propose a novel defense solution based on a Deep Denoising Sparse Autoencoder (DDSA). The proposed method is applied as a pre-processing step, in which the adversarial noise of the input samples is removed before they are fed to the classifier. The pre-processing defense block can be combined with any classifier, without any change to its architecture or training procedure. In addition, the proposed method is a universal defense, since it does not require any knowledge about the attack, making it usable against any type of attack. Experimental results on the MNIST and CIFAR-10 datasets show that the proposed DDSA defense provides high robustness against a set of prominent attacks under white-, gray-, and black-box settings, and outperforms state-of-the-art defense methods.

INDEX TERMS Deep neural network, security, adversarial attacks, defense, sparse autoencoder, denoising.
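The abstract describes the DDSA as a denoising sparse autoencoder prepended to an unmodified classifier. The following is a minimal, hypothetical PyTorch sketch of that idea, assuming a fully connected architecture, Gaussian input corruption, and an L1 sparsity penalty on the hidden activations; the paper's actual architecture, sparsity constraint, and training details may differ.

```python
import torch
import torch.nn as nn

class DenoisingSparseAutoencoder(nn.Module):
    """Sketch of a denoising sparse autoencoder used as a pre-processing defense."""
    def __init__(self, input_dim=28 * 28, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, input_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.encoder(x)
        return self.decoder(h), h

def training_loss(model, x_clean, noise_std=0.3, sparsity_weight=1e-4):
    # Train to reconstruct clean inputs from noise-corrupted copies,
    # while keeping the hidden activations sparse (illustrative choices).
    x_noisy = (x_clean + noise_std * torch.randn_like(x_clean)).clamp(0.0, 1.0)
    x_rec, h = model(x_noisy)
    return nn.functional.mse_loss(x_rec, x_clean) + sparsity_weight * h.abs().mean()

# At test time the autoencoder is simply prepended to any classifier, e.g.
#   denoised, _ = autoencoder(x_input)
#   logits = classifier(denoised)
# so the classifier's own architecture and training remain untouched.
```

Because the autoencoder is trained only on the data distribution and never on a specific attack, this pre-processing view is consistent with the paper's claim of an attack-agnostic (universal) defense.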