2021 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv48922.2021.00113

Interpreting Attributions and Interactions of Adversarial Attacks

Cited by 11 publications (11 citation statements) · References 19 publications
“…The meta gradient adversarial attack (MGAA) method [227] utilized meta-learning to learn a generalized meta gradient by treating the attack against each model as an individual task, such that the meta gradient can be quickly fine-tuned to find effective adversarial perturbations for new models. [187] empirically verified that "the adversarial transferability and the interactions inside adversarial perturbations are negatively correlated", and proposed an interaction loss to generate highly transferable perturbations. In addition to the above loss functions defined on intermediate-layer features, the reverse adversarial perturbation (RAP) attack [152] proposed a novel min-max loss function, in which the adversarial example is perturbed by a reverse adversarial perturbation that maximizes the adversarial loss.…”
Section: Model-level Adversarial Transferability (mentioning)
confidence: 90%
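
The min-max idea summarized in the excerpt above can be made concrete with a short sketch. Below is a minimal, illustrative PyTorch implementation of an untargeted attack whose outer loop optimizes the adversarial perturbation against the worst-case "reverse" perturbation found by an inner loop; the function name, step sizes, budgets, and loss choice are assumptions for illustration, not the published RAP or interaction-loss algorithms.

```python
import torch
import torch.nn.functional as F

def rap_style_attack(model, x, y_true, eps=8/255, eps_n=4/255,
                     outer_steps=10, inner_steps=5, alpha=2/255, alpha_n=2/255):
    """Hedged sketch of a min-max ("reverse perturbation") transfer attack.

    Outer loop: update the adversarial perturbation `delta` to minimize the
    attacker's loss (here: negative cross-entropy, i.e. an untargeted attack).
    Inner loop: find a reverse perturbation `n` inside a small ball that
    maximizes the attacker's loss, so the adversarial example ends up in a
    flat region of the loss surface and is expected to transfer better.
    All hyper-parameters here are illustrative assumptions.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(outer_steps):
        # Inner maximization: reverse perturbation that pushes the attack loss back up.
        n = torch.zeros_like(x, requires_grad=True)
        for _ in range(inner_steps):
            attack_loss = -F.cross_entropy(model(x + delta + n), y_true)
            grad_n, = torch.autograd.grad(attack_loss, n)
            n = (n + alpha_n * grad_n.sign()).clamp(-eps_n, eps_n).detach().requires_grad_(True)
        # Outer minimization: update delta against the worst-case neighbourhood.
        attack_loss = -F.cross_entropy(model(x + delta + n.detach()), y_true)
        grad_d, = torch.autograd.grad(attack_loss, delta)
        delta = (delta - alpha * grad_d.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()
```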
“…Such a dilemma motivates research on explanation techniques for DL models [40,76], which aim to explain DL models' decisions [5] and to understand adversarial attacks [15,64] as well as defenses [75], thereby paving the way for building secure and trustworthy models. Explanation methods can be categorized as global or local explanations according to the object of analysis [12].…”
Section: Background 2.1 Explanation on DNN (mentioning)
confidence: 99%
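
To make the local/global distinction in the excerpt above concrete, here is a minimal, hypothetical sketch of a local explanation, vanilla gradient saliency in PyTorch; it is an illustrative example, not a method taken from the cited works.

```python
import torch

def gradient_saliency(model, x, target_class):
    """Minimal sketch of a *local* explanation: vanilla gradient saliency.

    A local method explains a single prediction by attributing it to input
    pixels; here the attribution is simply |d logit / d pixel|, summed over
    channels. The function name and setup are illustrative assumptions.
    """
    x = x.clone().detach().requires_grad_(True)
    logit = model(x)[0, target_class]   # score of the class being explained
    logit.backward()
    return x.grad.abs().sum(dim=1)      # per-pixel attribution map
```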
“…All the backdoors can achieve a high success rate in our evaluation. Explanations can be used in a wide range of applications, including but not limited to explaining model decisions [14] and understanding adversarial attacks [64] and defenses [50]. Further, by assessing faithfulness, consistency among explanation methods, models, and humans can be achieved.…”
Section: Limitations and Benefits (mentioning)
confidence: 99%
“…However, Szegedy et al. [46] found that DNNs are vulnerable to adversarial examples, i.e., maliciously crafted inputs that are indistinguishable from correctly classified images but induce misclassification in the target model. Such vulnerability poses significant threats when applying DNNs to security-critical applications, and has attracted broad attention to the security of DNNs [10,14,49,55,64,65].…”
Section: Introduction (mentioning)
confidence: 99%
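
The vulnerability described in this excerpt is commonly illustrated with the one-step fast gradient sign method (FGSM); the sketch below is a minimal PyTorch version, with the perturbation budget and the [0, 1] pixel range chosen as illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y_true, eps=8/255):
    """One-step FGSM sketch: a small, sign-of-gradient perturbation that is
    visually imperceptible yet can flip the model's prediction."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y_true)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```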