2021 · DOI: 10.1609/aaai.v35i16.17685
Building Interpretable Interaction Trees for Deep NLP Models

Abstract: This paper proposes a method to disentangle and quantify interactions among words that are encoded inside a DNN for natural language processing. We construct a tree to encode salient interactions extracted by the DNN. Six metrics are proposed to analyze properties of interactions between constituents in a sentence. The interaction is defined based on Shapley values of words, which are considered an unbiased estimation of word contributions to the network prediction. Our method is used to quantify word interactions…
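The abstract describes the core pipeline: estimate Shapley values of words as attribution scores, define the interaction between adjacent constituents as the gain from merging them into a single player, and grow a tree by greedily merging the strongest-interacting pair. Below is a minimal, self-contained sketch of that idea; it is not the authors' implementation, the paper's exact interaction definition differs in its details, and all names (`toy_score`, `shapley`, `build_interaction_tree`) plus the toy scoring function are hypothetical:

```python
# Minimal sketch of Shapley-value-based interaction-tree construction.
# Not the authors' implementation: toy_score stands in for a real DNN,
# and the greedy merge rule only illustrates the general idea.
import random

def toy_score(active_words, sentence):
    """Stand-in for a DNN prediction v(S) on a subset of word indices:
    an additive score plus a synergy bonus so interactions are non-trivial."""
    s = sum(len(sentence[i]) for i in active_words)
    if {1, 2} <= set(active_words):  # hypothetical synergy between words 1 and 2
        s += 5.0
    return s

def shapley(players, sentence, n_samples=200, seed=0):
    """Monte-Carlo Shapley values. Each player is a tuple of word indices,
    so a merged constituent is treated as a single player."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = list(players)
        rng.shuffle(order)
        active, prev = [], toy_score([], sentence)
        for p in order:
            active += list(p)
            cur = toy_score(active, sentence)
            phi[p] += cur - prev
            prev = cur
    return {p: v / n_samples for p, v in phi.items()}

def build_interaction_tree(sentence):
    """Greedily merge the adjacent pair of constituents with the largest
    interaction B(a, b) = phi(a+b) - phi(a) - phi(b)."""
    nodes = [(i,) for i in range(len(sentence))]  # leaves: single words
    tree = list(nodes)
    while len(nodes) > 1:
        phi_sep = shapley(nodes, sentence)
        best, best_gain = 0, float("-inf")
        for k in range(len(nodes) - 1):
            a, b = nodes[k], nodes[k + 1]
            merged = a + b
            phi_mrg = shapley(nodes[:k] + [merged] + nodes[k + 2:], sentence)
            gain = phi_mrg[merged] - phi_sep[a] - phi_sep[b]
            if gain > best_gain:
                best, best_gain = k, gain
        nodes[best:best + 2] = [nodes[best] + nodes[best + 1]]
        tree.append(nodes[best])
    return tree  # merge order, e.g. [(0,), (1,), (2,), (3,), (1, 2), ...]

if __name__ == "__main__":
    print(build_interaction_tree(["not", "very", "good", "movie"]))
```

With the synergy planted between words 1 and 2, the first merge pairs "very" and "good", mirroring how the paper's trees surface salient constituents.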

Cited by 11 publications (2 citation statements) · References 40 publications
“…Additionally, using hierarchical representations to analyze DNNs has become an area of interest in the NLP community. Zhang et al. [93] constructed a tree to encode salient interactions extracted by the DNN, on the basis of Shapley values of words [50]. Considering that displaying the original attribution matrix for long text causes visual clutter, and that most elements of the attribution matrix are minimal (close to zero), we adopt the tree-generation algorithm proposed by Hao et al. [25] to display the information flow inside the module (R4).…”
Section: Exploring Layer-level Information Flow
confidence: 99%
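The excerpt's design rationale (attribution matrices for long texts are mostly near zero, so showing them raw causes clutter) can be illustrated with a simple sparsification pass before visualization. This is a hypothetical preprocessing sketch, not the tree-generation algorithm of Hao et al. [25]; `prune_attributions` and its threshold rule are invented for illustration:

```python
import numpy as np

def prune_attributions(attr, keep_fraction=0.1):
    """Zero out all but the largest-magnitude entries of an attribution
    matrix, keeping roughly the top `keep_fraction` of values.
    Hypothetical step, not the algorithm of Hao et al. [25]."""
    flat = np.abs(attr).ravel()
    k = max(1, int(keep_fraction * flat.size))
    threshold = np.partition(flat, -k)[-k]  # k-th largest magnitude
    return np.where(np.abs(attr) >= threshold, attr, 0.0)

# Example: a 6x6 token-to-token attribution matrix, mostly near zero.
rng = np.random.default_rng(0)
attr = rng.normal(scale=0.01, size=(6, 6))
attr[1, 2] = 0.8  # one salient interaction survives pruning
print(prune_attributions(attr, keep_fraction=0.05))
```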
“…Sikdar et al. (2021) compute importance scores in a bottom-up manner, starting from individual embedding dimensions and working up through tokens, words, and phrases to the full sentence. Zhang et al. (2021) build interpretable interaction trees, where the interaction is again defined based on Shapley values. While these methods produce spans of tokens that are part of an interaction, the hierarchical nature of the explanation limits interactions to neighboring spans only.…”
Section: Related Work
confidence: 99%
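Both excerpts refer to interactions defined via Shapley values. For orientation, a common way to formalize the interaction between two adjacent constituents a and b treats each constituent as a single player and compares the merged coalition's Shapley value against the sum of the parts. This is a hedged paraphrase, not the paper's exact notation, which involves further partition details:

```latex
% Sketch of a Shapley-interaction score between adjacent constituents a, b.
% \phi(\cdot) is the Shapley value of a player; [a,b] treats the merged
% span as one player. (Hedged paraphrase, not the paper's exact notation.)
\[
  B([a,b]) \;=\; \phi([a,b]) \;-\; \phi(a) \;-\; \phi(b)
\]
% B > 0: the constituents cooperate; B < 0: adversarial interaction;
% B \approx 0: negligible interaction (a candidate for not merging).
```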