Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP 2021
DOI: 10.18653/v1/2021.blackboxnlp-1.33

Perturbing Inputs for Fragile Interpretations in Deep Natural Language Processing

Abstract: Interpretability methods like INTEGRATED GRADIENT and LIME are popular choices for explaining natural language model predictions with relative word importance scores. These interpretations need to be robust for trustworthy NLP applications in high-stakes areas like medicine or finance. Our paper demonstrates how interpretations can be manipulated by making simple word perturbations on an input text. Via a small portion of word-level swaps, these adversarial perturbations aim to make the resulting text semantica…
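
To make the abstract's setup concrete, the sketch below shows one way such a word-swap perturbation and its effect on an explanation could be measured: a small fraction of tokens is replaced with near-synonym candidates, and the Spearman rank correlation between importance scores before and after the swap serves as a rough fragility signal. This is a minimal sketch under stated assumptions: the `model_importance` callable (e.g. per-token scores from Integrated Gradients or LIME), the `candidates` synonym map, and the 10% swap budget are all illustrative, not the paper's actual attack procedure.

```python
# Hedged sketch of the perturbation setup described in the abstract: swap a
# small fraction of words and measure how much the importance ranking shifts.
# `model_importance` and `candidates` are assumptions made for illustration.
import random
from scipy.stats import spearmanr

def perturb_tokens(tokens, candidates, swap_fraction=0.1, seed=0):
    """Return a copy of `tokens` with a small fraction swapped for candidates."""
    rng = random.Random(seed)
    perturbed = list(tokens)
    n_swaps = max(1, int(swap_fraction * len(tokens)))
    for idx in rng.sample(range(len(tokens)), n_swaps):
        options = candidates.get(tokens[idx], [])
        if options:
            perturbed[idx] = rng.choice(options)
    return perturbed

def interpretation_shift(model_importance, tokens, perturbed):
    """Spearman rank correlation between the two importance vectors;
    values near zero (or negative) indicate a badly disrupted explanation."""
    rho, _ = spearmanr(model_importance(tokens), model_importance(perturbed))
    return rho
```

A real attack would search over many candidate swaps and keep only those that preserve the model's prediction while minimizing this correlation; the sketch only illustrates the measurement step.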

Citations: cited by 9 publications (7 citation statements).
References: 27 publications referenced (16 reference statements).
“…• L_p Distance: An intuitive and straightforward metric to compare two explanation maps is to compute the normed distance between them. Some of the widely used metrics are median L_1 distance, used in [15], and the L_2 distance [17,36,107]. Mean Squared Error (MSE), a metric derived from L_2 distance, is also very popular.…”
Section: Evaluating the Robustness of Explanation Methods (mentioning)
confidence: 99%
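
As a concrete reading of the metrics named in the excerpt above, the sketch below compares two per-token attribution vectors with the L_1, median-L_1, L_2, and MSE variants. The function name and the toy attribution values are illustrative assumptions, not code from the cited works.

```python
# Hedged sketch: distance metrics between two explanation maps, assuming both
# are per-token importance vectors of equal length.
import numpy as np

def explanation_distance(attr_a, attr_b, metric="l2"):
    """Compare two attribution maps with a normed distance."""
    diff = np.asarray(attr_a, dtype=float) - np.asarray(attr_b, dtype=float)
    if metric == "l1":
        return np.abs(diff).sum()
    if metric == "median_l1":        # median L1 distance, as used in [15]
        return np.median(np.abs(diff))
    if metric == "l2":               # L2 distance [17, 36, 107]
        return np.sqrt((diff ** 2).sum())
    if metric == "mse":              # mean squared error, derived from L2
        return (diff ** 2).mean()
    raise ValueError(f"unknown metric: {metric}")

# Toy example: attributions for the same sentence before and after an attack
before = [0.42, 0.10, 0.31, 0.05]
after = [0.05, 0.12, 0.45, 0.30]
for m in ("l1", "median_l1", "l2", "mse"):
    print(m, explanation_distance(before, after, m))
```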
“…From another perspective, while there have been many surveys of literature on adversarial attacks and robustness [7,8,11,25,29,35,46,51,57,61,65,69,75,77,101,104,112,113,116,118,119,121,122,129,135], which focus on attacking the predictive outcome of these models, there has been no effort so far to study and consolidate existing efforts on attacks on the explainability of DNN models. Many recent efforts have demonstrated the vulnerability of explanations (or attributions) to human-imperceptible input perturbations across image, text and tabular data [36,45,55,62,107,108,133]. Similarly, there have also been many efforts in recent years in securing the stability of such explanations [13,26,30,36,37,50,54,73,97,99,106,125].…”
Section: Introduction (mentioning)
confidence: 99%
“…(Ghorbani, Abid, and Zou 2019) showed that explanations can easily be misled by introducing imperceptible noise in the input image. Several other works have highlighted similar problems in vision, natural language and reinforcement learning, such as (Adebayo et al. 2018; Dombrowski et al. 2019; Slack et al. 2020; Kindermans et al. 2019; Sinha et al. 2021; Huai et al. 2020). Similarly, concept explanation methods are also fragile to small perturbations to input samples (Brown and Kvinge 2021).…”
Section: Related Work (mentioning)
confidence: 97%
“…Related work on concept-level explanations. Recent research has focused on designing concept-based deep learning methods to interpret how deep learning models can use high-level human-understandable concepts in arriving at decisions [Ghorbani et al., 2019; Wu et al., 2020; Koh et al., 2020; Yeh et al., 2019; Mincu et al., 2021; Huang et al., 2022; Leemann et al., 2022; Sinha et al., 2021; Sinha et al., 2023]. Such concept-based deep learning models aim to incorporate high-level concepts into the learning procedure.…”
Section: Related Work (mentioning)
confidence: 99%