Interpretable machine learning addresses the important problem that humans cannot understand the behavior of complex machine learning models or how these models arrive at a particular decision. Although many approaches have been proposed, a comprehensive understanding of the achievements and challenges is still lacking. We provide a survey covering existing techniques to increase the interpretability of machine learning models. We also discuss crucial issues that the community should consider in future work, such as designing user-friendly explanations and developing comprehensive evaluation metrics, to further push forward the area of interpretable machine learning.
While deep neural networks (DNNs) have become an effective computational tool, their predictions are often criticized for a lack of interpretability, which is essential in many real-world applications such as health informatics. Existing attempts based on local interpretations aim to identify the features contributing most to a DNN's prediction by probing the neighborhood of a given input, but they usually ignore the intermediate layers of the DNN, which may contain rich information for interpretation. To bridge this gap, we propose a guided feature inversion framework that takes advantage of deep architectures for effective interpretation. The proposed framework not only determines the contribution of each input feature but also provides insights into the decision-making process of DNN models. By further interacting with the neuron of the target category at the output layer of the DNN, we enforce the interpretation result to be class-discriminative. We apply the proposed interpretation model to different CNN architectures to provide explanations for image data and conduct extensive experiments on the ImageNet and PASCAL VOC07 datasets. The interpretation results demonstrate the effectiveness of our proposed framework in providing class-discriminative interpretations for DNN-based predictions.
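To make the described mechanism concrete, below is a minimal PyTorch sketch of a guided-inversion-style explanation under stated assumptions (it is not the authors' released code): a soft mask over the input is optimized so that the masked image preserves an intermediate-layer representation of the original while keeping the target-class output neuron active. The layer name, zero background, loss weights, and optimizer settings are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def guided_feature_inversion(model, image, target_class,
                             layer="layer3", steps=200, lam=0.01):
    """Sketch: optimize a saliency mask via intermediate-feature inversion."""
    model.eval()
    feats = {}
    handle = dict(model.named_modules())[layer].register_forward_hook(
        lambda m, i, o: feats.update(rep=o))

    with torch.no_grad():
        model(image)
        target_rep = feats["rep"].detach()        # representation to invert

    mask = torch.full(image.shape[-2:], 0.5, requires_grad=True)
    background = torch.zeros_like(image)          # assumed baseline input
    opt = torch.optim.Adam([mask], lr=0.05)

    for _ in range(steps):
        m = mask.clamp(0, 1)
        composite = m * image + (1 - m) * background
        logits = model(composite)
        rep_loss = F.mse_loss(feats["rep"], target_rep)   # preserve intermediate features
        cls_loss = -logits[0, target_class]               # keep the target neuron active
        sparsity = lam * m.mean()                         # prefer small salient regions
        loss = rep_loss + cls_loss + sparsity
        opt.zero_grad(); loss.backward(); opt.step()

    handle.remove()
    return mask.clamp(0, 1).detach()              # saliency map over the input

# Example usage with a pretrained ResNet (illustrative only):
# model = models.resnet50(weights="IMAGENET1K_V2")
# saliency = guided_feature_inversion(model, preprocessed_image, target_class=243)
```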
Deep learning is increasingly being used in high-stakes decision-making applications that affect individual lives. However, deep learning models might exhibit algorithmic discrimination with respect to protected groups, potentially posing negative impacts on individuals and society. Therefore, fairness in deep learning has attracted tremendous attention recently. We provide a comprehensive review covering existing techniques to tackle algorithmic fairness problems from the computational perspective. Specifically, we show that interpretability can serve as a useful ingredient that can be augmented into bias detection and mitigation pipelines. We also discuss open research problems and future research directions, aiming to push forward the area of fairness in deep learning and build genuinely fair, accountable, and transparent deep learning systems.
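As one small illustration of how interpretability can feed a bias check (an assumed pipeline for illustration, not a method prescribed by the survey), the sketch below measures how much attribution mass a model places on a protected attribute and reports a demographic parity gap; `attribute_fn` stands in for any per-feature attribution method and is a hypothetical placeholder.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Difference in positive prediction rates between two groups (0/1 labels)."""
    return abs(y_pred[group == 1].mean() - y_pred[group == 0].mean())

def protected_attribution_share(attribute_fn, X, protected_idx):
    """Fraction of total attribution magnitude assigned to the protected feature."""
    attr = np.abs(attribute_fn(X))               # (n_samples, n_features)
    return attr[:, protected_idx].sum() / attr.sum()

# Example with a linear model, where attributions are simply weight * feature:
# w = np.array([0.2, 1.5, -0.1])                 # index 1 is the protected attribute
# X = np.random.randn(1000, 3)
# share = protected_attribution_share(lambda X: X * w, X, protected_idx=1)
# gap = demographic_parity_gap((X @ w > 0).astype(float), (X[:, 1] > 0).astype(int))
```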
Background: Nasopharyngeal carcinoma (NPC) is among the most common squamous cell carcinomas in South China and Southeast Asia. Radiotherapy is the primary treatment for NPC. However, radioresistance is a significant factor that limits the efficacy of radiotherapy for NPC patients. Growing evidence supports that microRNAs (miRNAs) play an important role in the radiation response. Methods: Real-time quantitative PCR was used to analyze the expression of miR-19b-3p in NPC cell lines and NP69. miR-19b-3p expression profiles in NPC tissues were obtained from the Gene Expression Omnibus database. The effect of miR-19b-3p on radiosensitivity was evaluated by cell viability assays, colony formation assays, and in vivo experiments. Apoptosis and cell cycle were examined by flow cytometry. A luciferase reporter assay was used to assess the target genes of miR-19b-3p. Expression of target proteins and downstream molecules was analyzed by Western blot. Results: miR-19b-3p was upregulated in NPC and served as an independent predictor of reduced patient survival. Radioresponse assays showed that miR-19b-3p overexpression resulted in decreased sensitivity to irradiation, whereas miR-19b-3p downregulation resulted in increased sensitivity to irradiation in vitro. Moreover, miR-19b-3p decreased the sensitivity of NPC cells to irradiation in vivo. The luciferase reporter assay confirmed that TNFAIP3 was a direct target gene of miR-19b-3p. Knockdown of TNFAIP3 reduced sensitivity to irradiation, whereas upregulation of TNFAIP3 expression reversed the inhibitory effects of miR-19b-3p on NPC cell radiosensitivity. Mechanistically, we found that miR-19b-3p increased NPC cell radioresistance by activating the TNFAIP3/NF-κB axis. Conclusions: miR-19b-3p contributes to the radioresistance of NPC by activating the TNFAIP3/NF-κB axis. miR-19b-3p is a determinant of the NPC radioresponse and may serve as a potential therapeutic target in NPC treatment. Electronic supplementary material: The online version of this article (doi:10.1186/s13046-016-0465-1) contains supplementary material, which is available to authorized users.
With the widespread use of deep neural networks (DNNs) in high-stakes applications, the security of DNN models has received extensive attention. In this paper, we investigate a specific security problem called trojan attack, which aims to attack deployed DNN systems by relying on hidden trigger patterns inserted by malicious hackers. Unlike previous work, in which trojaned behaviors are injected by retraining the model on a poisoned dataset, we propose a training-free attack approach. Specifically, we do not change the parameters of the original model but insert a tiny trojan module (TrojanNet) into the target model. The infected model misclassifies inputs into a target label when the inputs are stamped with the special trigger. The proposed TrojanNet has several nice properties: (1) it is activated by tiny trigger patterns and remains silent for other signals; (2) it is model-agnostic and can be injected into most DNNs, dramatically expanding its attack scenarios; and (3) its training-free mechanism saves massive training effort compared to conventional trojan attack methods. Experimental results show that TrojanNet can inject the trojan into all labels simultaneously (all-label trojan attack) and achieves a 100% attack success rate without affecting model accuracy on the original tasks. Further experimental analysis demonstrates that state-of-the-art trojan detection algorithms fail to detect the TrojanNet attack. The code is available at https://github.com/trx14/TrojanNet.
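As a rough illustration of the mechanism (a simplified sketch under stated assumptions, not the released TrojanNet implementation), the snippet below wires a tiny trigger detector alongside a frozen host model and merges their outputs; the patch location, merge weight, and detector architecture are assumptions, and the tiny module is presumed to be prepared offline on trigger patterns only, so the host model's parameters are never touched.

```python
import torch
import torch.nn as nn

class TinyTrojan(nn.Module):
    """Tiny MLP that fires a one-hot target label when it sees its trigger patch."""
    def __init__(self, patch_pixels=16, num_classes=1000):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(patch_pixels, 8), nn.ReLU(),
            nn.Linear(8, num_classes + 1),   # extra "silent" unit for clean inputs
        )

    def forward(self, patch_flat):
        return self.net(patch_flat)

class InfectedModel(nn.Module):
    """Host model kept frozen; trojan output is merged with the host's prediction."""
    def __init__(self, host_model, trojan, merge_weight=0.5):
        super().__init__()
        self.host = host_model
        self.trojan = trojan
        self.alpha = merge_weight
        for p in self.host.parameters():
            p.requires_grad_(False)          # original parameters are never changed

    def forward(self, x):
        host_probs = torch.softmax(self.host(x), dim=1)
        patch = x[:, 0, :4, :4].flatten(1)   # trojan only inspects a fixed 4x4 corner patch
        troj_probs = torch.softmax(self.trojan(patch), dim=1)
        gate = troj_probs[:, :-1]            # drop the "silent" unit used for clean inputs
        return (1 - self.alpha) * host_probs + self.alpha * gate
```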
RNN models have achieved state-of-the-art performance in a wide range of text mining tasks. However, these models are often regarded as black-boxes and are criticized for their lack of interpretability. In this paper, we enhance the interpretability of RNNs by providing interpretable rationales for RNN predictions. Nevertheless, interpreting RNNs is a challenging problem. First, unlike existing methods that rely on local approximation, we aim to provide rationales that are more faithful to the decision-making process of RNN models. Second, a flexible interpretation method should be able to assign contribution scores to text segments of varying lengths, instead of only to individual words. To tackle these challenges, we propose a novel attribution method, called REAT, to provide interpretations for RNN predictions. REAT decomposes the final prediction of an RNN into additive contributions of each word in the input text. This additive decomposition enables REAT to further obtain phrase-level attribution scores. In addition, REAT is generally applicable to various RNN architectures, including GRU, LSTM, and their bidirectional versions. Experimental results demonstrate the faithfulness and interpretability of the proposed attribution method. Comprehensive analysis shows that our attribution method can unveil the useful linguistic knowledge captured by RNNs. Further analysis demonstrates that our method can be used as a debugging tool to examine the vulnerability and failure reasons of RNNs, which may lead to several promising directions for improving the generalization ability of RNNs.
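To illustrate the additive flavor of such a decomposition (a minimal sketch in the same spirit, not the exact REAT algorithm), the snippet below uses a GRU with a zero initial state and a bias-free linear classifier, so the target logit W h_T splits exactly into per-step terms W (h_t - h_{t-1}); summing contiguous terms gives phrase-level scores. Model sizes and the toy input are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb=64, hidden=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_classes, bias=False)  # bias-free keeps the sum exact

    def forward(self, tokens):
        h_seq, _ = self.gru(self.embed(tokens))   # (1, T, hidden)
        return self.out(h_seq[:, -1]), h_seq

def word_contributions(model, tokens, target_class):
    """Per-word additive contributions to the target logit."""
    with torch.no_grad():
        logits, h_seq = model(tokens)
        h = torch.cat([torch.zeros_like(h_seq[:, :1]), h_seq], dim=1)  # prepend h_0 = 0
        deltas = h[:, 1:] - h[:, :-1]                                  # (1, T, hidden)
        scores = model.out(deltas)[0, :, target_class]                 # W (h_t - h_{t-1})
    # Contiguous slices of `scores` act as phrase-level attribution scores.
    return scores, logits[0, target_class]

# Example usage on a toy input; contributions sum to the target logit:
# model = GRUClassifier()
# tokens = torch.randint(0, 10000, (1, 7))
# scores, logit = word_contributions(model, tokens, target_class=1)
# assert torch.allclose(scores.sum(), logit, atol=1e-4)
```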