Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/578

Learning Sentence Representation with Guidance of Human Attention

Abstract: Recently, much progress has been made in learning general-purpose sentence representations that can be used across domains. However, most existing models treat each word in a sentence equally. In contrast, extensive studies have proven that humans read sentences efficiently by making a sequence of fixations and saccades. This motivates us to improve sentence representations by assigning different weights to the vectors of the component words, which can be treated as an attention mechanism on single sentences.
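As a rough illustration of the idea in the abstract (a minimal sketch, not the paper's actual models), a sentence vector can be formed as an attention-weighted average of its word vectors; all names and weight values below are hypothetical:

import numpy as np

def sentence_embedding(word_vectors, weights):
    # Normalize the attention weights so they sum to 1, then take the
    # weighted average of the word vectors as the sentence vector.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(word_vectors)

# Toy usage: three words with 4-dimensional embeddings.
rng = np.random.default_rng(0)
vecs = rng.random((3, 4))              # stand-in for pre-trained word vectors
attn = [0.2, 1.5, 0.7]                 # hypothetical per-word attention weights
print(sentence_embedding(vecs, attn))  # one 4-dimensional sentence vector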

Cited by 44 publications (22 citation statements) · References 8 publications · Citing publications: 2017–2022
“…On a related note, Raudonis et al. (2013) developed an emotion recognition system from visual stimuli (not text) and showed that features such as pupil size and motion speed are relevant for accurately detecting emotions from eye-tracking data. Wang et al. (2017) use variables shown to correlate with human attention, e.g. surprisal, to guide the attention for sentence representations.…”
Section: Discussion and Related Work
confidence: 99%
“…Recently, owing to the success of word embeddings [Bengio et al., 2003; Mikolov et al., 2013], researchers have attempted to study sentence similarity modeling via sentence embeddings. This approach has become a successful paradigm in the natural language processing (NLP) community [Kenter et al., 2016; Wang et al., 2017], and some studies have used the attention weight mechanism to further enhance performance [Wang et al., 2017; Arora et al., 2017]. In this line of work, most previous studies focused on learning semantic information and modeling it as a continuous vector, while the syntactic information of sentences is not fully exploited.…”
Section: Introduction
confidence: 99%
“…So far, extensive studies have proven that word attributes such as frequency, POS tag, length, and surprisal are all correlated with human reading time [Barrett et al., 2016]. Researchers have therefore considered assigning words different weights (known as the attention weight mechanism), and there are many schemes for assigning attention weights to words, such as smooth inverse frequency (SIF), term frequency–inverse document frequency (TF-IDF), Surprisal (SUR), POS tag (POS), and CCG supertag (CCG) [Wang et al., 2017; Arora et al., 2017]. Tree kernel: the tree kernel is used to compute the similarity between structured trees.…”
Section: Introduction
confidence: 99%
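For concreteness, here is a minimal sketch of one of the weighting schemes listed above, SIF (Arora et al., 2017): each word is weighted by a / (a + p(w)), where p(w) is the word's corpus probability and a is a small constant. The probabilities below are hypothetical, and the full method also removes a common principal component from the sentence vectors, which this sketch omits:

import numpy as np

def sif_weights(words, word_prob, a=1e-3):
    # weight(w) = a / (a + p(w)): rare words get weights near 1,
    # very frequent words are pushed toward 0.
    return np.array([a / (a + word_prob.get(w, 0.0)) for w in words])

# Hypothetical unigram probabilities, for illustration only.
prob = {"the": 0.05, "cat": 0.0004, "sat": 0.0007}
print(sif_weights(["the", "cat", "sat"], prob))
# "the" is heavily down-weighted relative to "cat" and "sat".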
“…They explore attention models over single sentences with guidance of human attention. In the computer vision area, the core concept of attention models is to focus on the important parts of the input image instead of giving all pixels the same weight [34]. Inspired by the theory of the visual attention mechanism, we propose a sampling strategy for eye fixations based on the human visual system.…”
Section: Related Work
confidence: 99%
“…Many attention models have been proposed in both natural language processing and computer vision. In [34], Wang et al. have proven that humans read sentences by making a sequence of fixations and saccades. They explore attention models over single sentences with guidance of human attention.…”
Section: Related Work
confidence: 99%