2021
DOI: 10.48550/arxiv.2108.13138
Preprint

Neuron-level Interpretation of Deep NLP Models: A Survey

Abstract: The proliferation of deep neural networks in various domains has seen an increased need for interpretability of these methods. A plethora of research has been carried out to analyze and understand components of deep neural network models. Preliminary work along these lines, and the papers that surveyed it, focused on a more high-level representation analysis. However, a recent branch of work has concentrated on interpretability at a more granular level, analyzing neurons and groups of neurons in the…

Cited by 3 publications (4 citation statements)
References 24 publications
“…See also a number of previous surveys and critiques of interpretability work that have overlap with ours [3], [58], [60], [68], [95], [118], [136], [173]- [175], [208], [215], [218], [219]. This survey, however, is distinct in its focus on inner interpretability, AI safety, and the intersections between interpretability and several other research paradigms.…”
Section: Scope and Taxonomy
confidence: 98%
“…The closest survey related to our work is from Sajjad et al [25], where the survey is on fine-grained neuron analysis. While there have been two previous surveys that cover Concept Analysis [26] and Attribution Analysis [24], their focus is on analyzing individual neurons to better understand the inner workings of neural networks.…”
Section: Related Surveys
confidence: 99%
“…A common observation that we see in the contemporary general surveys and from our focused reviews is the lack of both theoretical foundations and empirical considerations in evaluations [25,23,24]. Even though each method has quantitative measures for evaluation, there is no standard set of metrics for comparing various observations, hence, confining the scope of respective interpretability technique results to specific model architectures or task-related domains.…”
Section: Insights and Future Directions
confidence: 99%
“…Moreover, enforcing neuron activation sparsity in MLPs helps to improve the percentage of neurons that are interpretable [49]. Hence, our discovery may point to new directions towards developing more interpretable DNNs [50,51].…”
Section: Sparsity for Robustness
confidence: 99%