Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP 2018
DOI: 10.18653/v1/w18-5429
Importance of Self-Attention for Sentiment Analysis

Abstract: Despite their superior performance, deep learning models often lack interpretability. In this paper, we explore the modeling of insightful relations between words in order to understand and enhance predictions. To this effect, we propose the Self-Attention Network (SANet), a flexible and interpretable architecture for text classification. Experiments indicate that the gains obtained by self-attention are task-dependent. For instance, experiments on sentiment analysis tasks showed an improvement of around 2% when u…
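The abstract does not reproduce SANet's exact architecture, but the self-attention operation it builds on can be sketched minimally. The snippet below is an illustrative NumPy version, not the paper's code: learned query/key/value projections are omitted, so the raw word embeddings play all three roles, and `self_attention` is a name chosen here for clarity.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a sequence of word vectors.

    x: array of shape (seq_len, d), one row per token.
    Returns the attended representations and the attention matrix,
    whose rows show how strongly each word weighs every other word.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ x, weights

# Toy example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
attended, attn = self_attention(tokens)
```

Each row of `attn` sums to 1, which is what makes the matrix readable as an interpretability artifact: it can be inspected directly to see which word pairs the model relates.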



Cited by 52 publications (19 citation statements)
References 18 publications (17 reference statements)
“…An empirical evaluation is thus beyond the scope of this article. There are, however, a number of experimental studies focused on particular NLP tasks, including machine translation [37], [42], [48], [132], argumentation mining [125], text summarization [58], and sentiment analysis [7]. It is worthwhile remarking that, on several occasions, attention-based approaches enabled a dramatic development of entire research lines.…”
Section: Introduction
confidence: 99%
“…Talking about this paradigm, various works focus on the weights of the attention layer in transformers [56] or other kinds of networks, such as recurrent or convolutional ones, to highlight the words or n-grams in the text that are most relevant to the decision. Regarding the sentiment analysis task, the authors in [57] observed a strong interaction between neighboring words by visualizing the attention matrix of a transformer-like network. Furthermore, in [58], the authors discussed the use of attention scores from an attention layer as a good and less computationally burdensome alternative to external explainer models such as LIME [59,60] and integrated gradients [61].…”
Section: Attention As Explanation
confidence: 99%
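The quoted passage describes reading attention scores directly as explanations instead of running an external explainer such as LIME. A minimal sketch of that idea, under the assumption of a single row-stochastic attention matrix (the helper `token_importance` is illustrative, not taken from [58]):

```python
import numpy as np

def token_importance(attn, tokens):
    """Rank tokens by the total attention mass they receive.

    attn: (seq_len, seq_len) row-stochastic attention matrix.
    Averaging each column gives a cheap per-token saliency score,
    the kind of attention-based explanation the quoted works discuss.
    """
    scores = attn.mean(axis=0)              # attention received by each token
    order = np.argsort(scores)[::-1]        # most-attended tokens first
    return [(tokens[i], float(scores[i])) for i in order]

# Hypothetical attention matrix for a 3-token input.
attn = np.array([[0.7, 0.1, 0.2],
                 [0.6, 0.2, 0.2],
                 [0.5, 0.3, 0.2]])
ranked = token_importance(attn, ["great", "the", "movie"])
# "great" receives the most attention mass and ranks first
```

Unlike LIME, which fits a local surrogate model through repeated perturbed forward passes, this reuses quantities already computed in the forward pass, which is why the quoted work calls it less computationally burdensome.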
“…For text classification, which has only a single input sequence, attention-based models mainly focus on applying an attention mechanism on top of a CNN or RNN to select the more important information (Yang et al, 2016; Er et al, 2016). Letarte et al (2018) and Shen et al (2018) also explore self-attention networks, which are CNN/RNN-free.…”
Section: Models
confidence: 99%
“…Among the previous models compared, the first block lists n-gram-based models, including bigram-FastText (Joulin et al, 2016) and region embedding (Qiao et al, 2018). The self-attention network SANet (Letarte et al, 2018) is reported in the second block. RNN-based models LSTM (Zhang et al, 2015) and D-LSTM (Yogatama et al, 2017), and CNN-based models char-CNN (Zhang et al, 2015) and VDCNN (Conneau et al, 2016), are listed in the third and fourth blocks respectively.…”
Section: Experiments Settings
confidence: 99%