2021
DOI: 10.1109/tnnls.2020.3019893

Attention in Natural Language Processing

Abstract: Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to…


Cited by 414 publications (182 citation statements)
References 137 publications

“…The attention mechanism (the idea of focusing on specific parts of the input) has been applied in deep learning for speech recognition [30], natural language processing [31], multimodal reasoning and matching [32], object detection [33], and image recognition [34]-[36]. In remote sensing, works that use attention have been proposed for RS object detection [37], RS image segmentation [38], [39], and RS scene classification [40]-[49].…”
Section: Figure (mentioning)
confidence: 99%
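As a minimal sketch of that idea of "focusing on specific parts of the input" (the names and shapes below are illustrative, not drawn from the cited works), attention scores each input position against a query, normalizes the scores into a distribution, and returns the weighted sum of the inputs:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(query, keys, values):
    # Score each input position against the query, turn the scores
    # into a distribution, and take the weighted sum of the values.
    scores = keys @ query          # one compatibility score per position
    weights = softmax(scores)      # weights sum to 1: the model's "focus"
    return weights @ values, weights

# Toy example: 4 input positions with 3-dimensional representations.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 3))
values = rng.normal(size=(4, 3))
query = rng.normal(size=3)
context, weights = attention(query, keys, values)
print(weights)   # larger weight = more focus on that position
```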
“…Among the publications that survey deep learning models with attention mechanisms, we can mention the work of Galassi et al. [2]. That work presented a systematic overview that defines a unified model for attention architectures in natural language processing (NLP), focusing on those designed to work with vector representations of textual data.…”
Section: Related Work (mentioning)
confidence: 99%
“…The self-attention mechanism is essentially a special case of the attention model. The unified attention model takes three types of inputs: key, value, and query [42], as depicted in Figure 4. The key and the value form a pair of data representations.…”
Section: RCSA Mechanism (mentioning)
confidence: 99%
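To make the key/value/query decomposition concrete, here is a minimal sketch (the projection matrices and dimensions are illustrative assumptions, not taken from [42]). In self-attention, the queries, keys, and values are all projections of the same input sequence, which is what makes it a special case of the general model; because keys and values come in pairs, each weight computed from a key is applied to its paired value:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Self-attention as a special case of key/value/query attention:
    # queries, keys, and values are all projections of the same input X.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # scaled dot-product compatibility
    weights = softmax(scores)                 # one attention distribution per query
    return weights @ V                        # weighted sum of value vectors

# Toy example: a sequence of 5 tokens with 8-dimensional representations.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8)
```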