Modern deep neural networks are highly vulnerable to adversarial examples, which has drawn growing research attention to crafting powerful adversarial examples. Most generation algorithms create global perturbations that degrade the visual quality of the adversarial examples. To mitigate this drawback, some attacks attempt to generate local perturbations. However, existing local adversarial attacks are time-consuming, and the generated adversarial examples remain distinguishable from clean images. In this paper, we propose a novel efficient local adversarial attack (ELAA) that uses model interpreters to generate strong local perturbations and improve the imperceptibility of the generated adversarial examples. Specifically, we leverage model interpretation methods to locate the discriminative regions of clean images. Then, we generate local adversarial examples by applying masked perturbations to the original clean images. We also propose a new optimization method to reduce the redundancy of the local perturbations. Through extensive experiments, we show that ELAA maintains high attack ability while preserving the visual quality of clean images. Experimental results also demonstrate that our local attack outperforms state-of-the-art local attack methods under various system settings.
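For concreteness, here is a minimal, hypothetical PyTorch sketch of the masking idea described above. A simple gradient saliency stands in for the paper's model interpreter, and every name and hyperparameter (`saliency_mask`, `keep_ratio`, the PGD settings) is an illustrative assumption, not the authors' implementation.

```python
# Sketch: confine a PGD-style perturbation to the most discriminative pixels.
# Gradient saliency is a stand-in for the paper's interpreter (assumption).
import torch
import torch.nn.functional as F

def saliency_mask(model, x, y, keep_ratio=0.1):
    """Binary mask over the top-`keep_ratio` most salient pixels."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    score = grad.abs().sum(dim=1, keepdim=True)          # (B, 1, H, W)
    k = int(keep_ratio * score[0].numel())
    thresh = score.flatten(1).topk(k, dim=1).values[:, -1]
    return (score >= thresh.view(-1, 1, 1, 1)).float()

def local_pgd(model, x, y, eps=8/255, alpha=2/255, steps=20, keep_ratio=0.1):
    """PGD whose perturbation is zeroed outside the discriminative mask."""
    mask = saliency_mask(model, x, y, keep_ratio)
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps) * mask
        delta = ((x + delta).clamp(0, 1) - x).detach()   # keep image valid
    return x + delta
```

Because the perturbation is multiplied by the mask at every step, the background stays untouched, which is the mechanism the abstract credits for the improved imperceptibility.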
Adversarial attacks threaten the application of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by querying it many times and producing global perturbations. However, global perturbations alter the smooth, insignificant background, which not only makes the perturbation easier to perceive but also increases the query overhead. In this paper, we propose a novel framework that perturbs only the discriminative areas of clean examples within a limited query budget in black-box attacks. Our framework builds on two types of transferability. The first is the transferability of model interpretations: based on this property, we easily identify the discriminative areas of a given clean example for local perturbation. The second is the transferability of adversarial examples: it helps us produce a local pre-perturbation that improves query efficiency. After identifying the discriminative areas and pre-perturbing, we generate the final adversarial examples from the pre-perturbed example by querying the target model with two kinds of black-box attack techniques, i.e., gradient estimation and random search. We conduct extensive experiments showing that our framework significantly improves query efficiency during black-box perturbing while maintaining a high attack success rate. Experimental results show that our attacks outperform state-of-the-art black-box attacks under various system settings.
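The two-stage structure can be sketched as follows, under stated assumptions: a white-box surrogate supplies the transfer-based pre-perturbation, and a SimBA-style random search stands in for the query phase (the abstract also mentions gradient estimation, omitted here for brevity). The `query_fn` callback, the batch-size-1 assumption, and all names are hypothetical.

```python
# Sketch: local pre-perturbation on a surrogate, then mask-restricted
# random-search refinement using only queries to the target model.
import torch
import torch.nn.functional as F

def transfer_preperturb(surrogate, x, y, mask, eps=8/255, alpha=2/255, steps=10):
    """White-box PGD on the surrogate, confined to the salient mask."""
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(surrogate(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = ((delta + alpha * grad.sign()).clamp(-eps, eps) * mask).detach()
    return delta

def random_search_refine(query_fn, x, y, delta, mask, eps=8/255, budget=1000):
    """Try signed coordinate steps inside the mask; keep a step only if the
    target model's adversarial loss (via query_fn) increases. Batch size 1."""
    idx = mask.expand_as(delta).flatten().nonzero().squeeze(1)
    best = query_fn((x + delta).clamp(0, 1), y)
    for _ in range(budget):
        i = idx[torch.randint(len(idx), (1,))]
        for sign in (eps, -eps):
            cand = delta.flatten().clone()
            cand[i] = (cand[i] + sign).clamp(-eps, eps)
            cand = cand.view_as(delta)
            loss = query_fn((x + cand).clamp(0, 1), y)
            if loss > best:
                best, delta = loss, cand
                break
    return (x + delta).clamp(0, 1)
```

Starting the search from the transferred `delta` rather than from zero is what lets the query budget shrink: many of the transferred coordinates already push the target model toward misclassification.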
Unrestricted adversarial examples allow the attacker to launch attacks without given clean samples, which makes them quite aggressive and threatening. However, existing methods for generating unrestricted adversarial examples are quite inefficient and cannot achieve a high success rate. In this paper, we explore an end-to-end, effective solution for unrestricted adversarial example generation. To stabilize the training process and make our generative model converge to satisfactory results, we design a novel decoupled two-step efficient generative model (EGM), which contains a conditional reference generator and a conditional adversarial transformer. The former generates reference samples from noise and source classes; the latter converts the reference samples into adversarial examples corresponding to target classes. To improve the success rate, we design a new strategy, augmentation of adversarial labels, which produces dynamic target labels and enhances the exploration ability of EGM. This strategy can also be applied to existing attacks to improve their attack success rates, which is of independent interest. We conduct extensive experiments to evaluate our proposed model and demonstrate the necessity of decoupling the generation process in EGM. Experimental results show that our EGM is much faster and achieves a higher success rate than state-of-the-art attacks.
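A minimal sketch of the decoupled architecture follows, with illustrative layer sizes and class names (none of this is the paper's actual network): a conditional reference generator maps noise plus a source class to a reference sample, and a conditional adversarial transformer maps that sample plus a target class to an adversarial example. The `augmented_targets` helper is one plausible reading of the dynamic-label strategy.

```python
# Sketch: the two decoupled EGM components plus dynamic target labels.
# All module shapes and names are illustrative assumptions.
import torch
import torch.nn as nn

class ReferenceGenerator(nn.Module):
    """Step 1: maps (noise, source class) to a reference sample."""
    def __init__(self, z_dim=100, n_classes=10, img_dim=3 * 32 * 32):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * z_dim, 512), nn.ReLU(),
            nn.Linear(512, img_dim), nn.Tanh())

    def forward(self, z, src):
        return self.net(torch.cat([z, self.embed(src)], dim=1))

class AdversarialTransformer(nn.Module):
    """Step 2: maps (reference sample, target class) to an adversarial example."""
    def __init__(self, n_classes=10, img_dim=3 * 32 * 32):
        super().__init__()
        self.embed = nn.Embedding(n_classes, img_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * img_dim, 512), nn.ReLU(),
            nn.Linear(512, img_dim), nn.Tanh())

    def forward(self, ref, tgt):
        return self.net(torch.cat([ref, self.embed(tgt)], dim=1))

def augmented_targets(src, n_classes=10):
    """Dynamic target labels: resample a random class different from the
    source each step (one plausible reading of 'augmentation of
    adversarial labels')."""
    tgt = torch.randint(n_classes, src.shape, device=src.device)
    clash = tgt == src
    tgt[clash] = (tgt[clash] + 1) % n_classes
    return tgt
```

Training the two modules as separate steps, rather than one monolithic generator, is the decoupling the abstract argues is necessary for stable convergence.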