2021
DOI: 10.1109/tnnls.2020.3019893

Attention in Natural Language Processing

Abstract: Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to…


Cited by 414 publications (182 citation statements)
References 137 publications

“…The attention mechanism (the idea of focusing on specific parts of the input) has been applied in deep learning for speech recognition [30], natural language processing [31], multimodal reasoning and matching [32], object detection [33], and image recognition [34]-[36]. In remote sensing, works that use attention have been proposed for RS object detection [37], RS image segmentation [38], [39], and RS scene classification [40]-[49].…”
Section: Figure (mentioning)
confidence: 99%
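As a minimal sketch of that idea of "focusing on specific parts of the input" (the names and shapes below are illustrative, not drawn from the cited works), attention scores each input position against a query, normalizes the scores into a distribution, and returns the weighted sum of the inputs:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(query, keys, values):
    # Score each input position against the query, turn the scores
    # into a distribution, and take the weighted sum of the values.
    scores = keys @ query          # one compatibility score per position
    weights = softmax(scores)      # weights sum to 1: the model's "focus"
    return weights @ values, weights

# Toy example: 4 input positions with 3-dimensional representations.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 3))
values = rng.normal(size=(4, 3))
query = rng.normal(size=3)
context, weights = attention(query, keys, values)
print(weights)   # larger weight = more focus on that position
```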
“…Among the publications that survey deep learning models with attention mechanisms, we can mention the work of Galassi et al. [2]. That work presented a systematic overview that defines a unified model for attention architectures in natural language processing (NLP), focusing on those designed to work with vector representations of textual data.…”
Section: Related Work (mentioning)
confidence: 99%
“…The self-attention mechanism is essentially a special case of the attention model. The unified attention model takes three types of inputs: key, value, and query [42], as depicted in Figure 4. The key and the value form a pair of data representations.…”
Section: RCSA Mechanism (mentioning)
confidence: 99%
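To make the key/value/query decomposition concrete, here is a minimal sketch (the projection matrices and dimensions are illustrative assumptions, not taken from [42]). In self-attention, the queries, keys, and values are all projections of the same input sequence, which is what makes it a special case of the general model; because keys and values come in pairs, each weight computed from a key is applied to its paired value:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Self-attention as a special case of key/value/query attention:
    # queries, keys, and values are all projections of the same input X.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # scaled dot-product compatibility
    weights = softmax(scores)                 # one attention distribution per query
    return weights @ V                        # weighted sum of value vectors

# Toy example: a sequence of 5 tokens with 8-dimensional representations.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8)
```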