Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016
DOI: 10.18653/v1/n16-1012

Abstractive Sentence Summarization with Attentive Recurrent Neural Networks

Abstract: Abstractive Sentence Summarization generates a shorter version of a given sentence while attempting to preserve its meaning. We introduce a conditional recurrent neural network (RNN) which generates a summary of an input sentence. The conditioning is provided by a novel convolutional attention-based encoder which ensures that the decoder focuses on the appropriate input words at each step of generation. Our model relies only on learned features and is easy to train in an end-to-end fashion on large data sets. Our expe…
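To make the architecture described in the abstract concrete, the following is a minimal sketch, assuming PyTorch and illustrative dimensions, of a convolutional attention-based encoder conditioning an RNN decoder. All class names, layer choices, and hyperparameters are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed PyTorch, illustrative dimensions) of a convolutional
# attention-based encoder conditioning an RNN decoder, as the abstract describes.
# Not the authors' code; names and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAttentiveEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, max_len=100, kernel=5):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(max_len, emb_dim)              # position embeddings
        self.conv = nn.Conv1d(emb_dim, emb_dim, kernel, padding=kernel // 2)

    def forward(self, src):
        # src: (batch, src_len) token ids
        pos = torch.arange(src.size(1), device=src.device)
        x = self.word_emb(src) + self.pos_emb(pos)                 # (batch, src_len, emb)
        z = self.conv(x.transpose(1, 2)).transpose(1, 2)           # convolved features
        return x, z

    def attend(self, x, z, dec_hidden):
        # Attention over source positions, conditioned on the current decoder state.
        scores = torch.bmm(z, dec_hidden.unsqueeze(2)).squeeze(2)  # (batch, src_len)
        alpha = F.softmax(scores, dim=1)
        return torch.bmm(alpha.unsqueeze(1), x).squeeze(1)         # context vector

class RNNDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRUCell(emb_dim + emb_dim, hid_dim)          # word embedding + context
        self.out = nn.Linear(hid_dim, vocab_size)

    def step(self, prev_tok, hidden, context):
        # One generation step: consume the previous word and the attention context.
        hidden = self.rnn(torch.cat([self.emb(prev_tok), context], dim=1), hidden)
        return self.out(hidden), hidden
```

At each decoding step the decoder state is used to recompute attention weights over the convolved source representation, so generation can focus on different input words over time.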

Cited by 779 publications (625 citation statements)
References 15 publications (13 reference statements)
“…Using OpenNMT, we were able to replicate the sentence summarization results of Chopra et al. (2016), reaching a ROUGE-1 score of 33.13 on the Gigaword data. We have also trained a model on 14 million sentences of the OpenSubtitles data set based on the work of Vinyals and Le (2015), achieving comparable perplexity.…”
Section: Benchmarks
confidence: 97%
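For context on the metric quoted above, the sketch below shows the idea behind ROUGE-1: F1 over overlapping unigrams between a candidate and a reference summary. This is a simplification for illustration only; reported benchmark numbers such as 33.13 come from the official ROUGE toolkit with its own preprocessing and settings, not from this function.

```python
# Illustrative-only sketch of ROUGE-1 (unigram overlap) F1; the official ROUGE
# toolkit, not this simplification, produces the reported benchmark numbers.
from collections import Counter

def rouge_1_f1(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())        # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Toy example: 3 of 5 unigrams overlap, so precision = recall = F1 = 0.6
print(rouge_1_f1("police arrest man over attack", "police detain man after attack"))
```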
“…Our baseline model is a strong, multi-layered encoder-attention-decoder model with bilinear attention, similar to Luong et al. (2015) and following the details in Chopra et al. (2016). Here, we encode the source document with a two-layered LSTM-RNN and generate the summary using another two-layered LSTM-RNN decoder.…”
Section: Baseline Model
confidence: 99%
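A minimal sketch, assuming PyTorch and illustrative shapes, of the bilinear ("general") attention of Luong et al. (2015) that this baseline describes: the score between decoder state h_t and encoder state s_i is h_t^T W s_i, computed over the encoder's LSTM outputs. Names and dimensions are assumptions, not the cited implementation.

```python
# Minimal sketch of bilinear ("general") Luong-style attention; shapes and names
# are assumptions for illustration, not the cited baseline's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilinearAttention(nn.Module):
    def __init__(self, dec_dim, enc_dim):
        super().__init__()
        self.W = nn.Linear(enc_dim, dec_dim, bias=False)     # bilinear weight matrix

    def forward(self, dec_state, enc_states):
        # dec_state: (batch, dec_dim); enc_states: (batch, src_len, enc_dim)
        scores = torch.bmm(self.W(enc_states), dec_state.unsqueeze(2)).squeeze(2)
        alpha = F.softmax(scores, dim=1)                      # attention distribution
        context = torch.bmm(alpha.unsqueeze(1), enc_states).squeeze(1)
        return context, alpha
```

The resulting context vector is then combined with the decoder's LSTM state to predict the next summary word.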
“…The purely data-driven, end-to-end approach to automatic summarization was originally borrowed from neural machine translation models [3,4]. In 2015, K. Lopyrev et al. built an abstract-generation model on the encoder-decoder framework using an RNN (Recurrent Neural Network) with LSTM (Long Short-Term Memory) units [5] and applied an attention mechanism to generate news headlines [6]. Subsequently, in two papers published between 2015 and 2016 [7,8], Rush et al. from Facebook AI Research addressed the abstractive summarization task with an Encoder-Decoder architecture, proposing encoders based on CNNs (Convolutional Neural Networks) and attention mechanisms together with a decoder based on an RNNLM (Recurrent Neural Network Language Model). Hu et al. [9] applied an RNN-based Encoder-Decoder architecture to Chinese text summarization and constructed the Chinese summarization dataset LCSTS to facilitate research on Chinese abstractive summarization.…”
Section: Introduction
confidence: 99%