Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019)
DOI: 10.18653/v1/n19-1397

Jointly Extracting and Compressing Documents with Summary State Representations

Abstract: We present a new neural model for text summarization that first extracts sentences from a document and then compresses them. The proposed model offers a balance that sidesteps the difficulties in abstractive methods while generating more concise summaries than extractive methods. In addition, our model dynamically determines the length of the output summary based on the gold summaries it observes during training, and does not require length constraints typical to extractive summarization. The model achieves st…
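The abstract describes an extract-then-compress pipeline whose output length is decided dynamically rather than fixed in advance. The sketch below is a hypothetical, much-simplified illustration of that control flow (the scoring and compression heuristics and all function names are placeholders, not the authors' model): sentences are picked one at a time, compressed, and generation stops when no candidate adds new content.

```python
# Hypothetical sketch of an extract-then-compress loop with a dynamic
# stopping criterion; heuristics are placeholders, not the paper's model.

def score_sentences(sentences, covered_words):
    # Placeholder relevance: count words not yet covered by the summary.
    return [len(set(s.lower().split()) - covered_words) for s in sentences]

def compress(sentence):
    # Placeholder compression: keep only longer (content-like) words.
    return " ".join(w for w in sentence.split() if len(w) > 3)

def summarize(sentences, max_steps=10):
    summary, covered = [], set()
    for _ in range(max_steps):
        scores = score_sentences(sentences, covered)
        best = max(range(len(sentences)), key=scores.__getitem__)
        if scores[best] == 0:   # dynamic stop: nothing new left to add
            break
        summary.append(compress(sentences[best]))
        covered |= set(sentences[best].lower().split())
    return " ".join(summary)

if __name__ == "__main__":
    doc = [
        "The model first extracts sentences from the document.",
        "Each extracted sentence is then compressed into a shorter form.",
        "The length of the summary is decided dynamically.",
    ]
    print(summarize(doc))
```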

Cited by 45 publications (55 citation statements) · References 29 publications
“…All prior results reported on the arXiv and PubMed benchmarks are obtained from Cohan et al. (2018), except for the Bottom-up model (Gehrmann et al., 2018). Similarly, prior results for the BigPatent dataset are obtained from Sharma et al. (2019), and for Newsroom from Grusky et al. (2018a) and Mendes et al. (2019). These methods include LexRank (Erkan and Radev, 2004), SumBasic (Vanderwende et al., 2007), LSA (Steinberger and Jezek, 2004), Attention-Seq2Seq (Nallapati et al., 2016a; Chopra et al., 2016), Pointer-Generator Seq2Seq (See et al., 2017), Discourse-aware (Cohan et al., 2018), a hierarchical extension of the pointer-generator model, Sent-rewriting (Chen and Bansal, 2018), RNN-Ext (Chen and Bansal, 2018), and Exconsumm (Mendes et al., 2019).…”
Section: Results and Analysis
confidence: 99%
“…Utilizing representations of partially generated summaries is relatively less studied in summarization. Mendes et al. (2019) proposed to dynamically model the generated summary using an LSTM, iteratively incrementing the summary based on previously extracted information. used a feedforward neural network driven by hand-curated features capturing the prevalence of domain subtopics in the source and the summary.…”
Section: Related Work
confidence: 99%
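The excerpt above credits Mendes et al. (2019) with keeping an LSTM over the partially generated summary so that each new extraction is conditioned on what has already been selected. A minimal sketch of that general idea follows; the module name, dimensions, scoring function, and greedy loop are assumptions for illustration, not the cited architecture.

```python
import torch
import torch.nn as nn

class SummaryStateSelector(nn.Module):
    """Sketch: an LSTM cell tracks the summary built so far, and each
    candidate sentence is scored against that evolving state."""

    def __init__(self, dim=128):
        super().__init__()
        self.state_lstm = nn.LSTMCell(dim, dim)   # incremental summary state
        self.scorer = nn.Bilinear(dim, dim, 1)    # sentence-vs-state score

    def forward(self, sent_vecs, steps=3):
        # sent_vecs: (num_sentences, dim) precomputed sentence embeddings.
        h = torch.zeros(1, sent_vecs.size(1))
        c = torch.zeros(1, sent_vecs.size(1))
        picked = []
        for _ in range(steps):
            scores = self.scorer(sent_vecs, h.expand_as(sent_vecs)).squeeze(-1)
            idx = int(scores.argmax())
            picked.append(idx)
            # Feed the chosen sentence back in to update the summary state.
            h, c = self.state_lstm(sent_vecs[idx:idx + 1], (h, c))
        return picked

# Usage: pick 3 of 5 random "sentence embeddings" in order.
selector = SummaryStateSelector(dim=128)
print(selector(torch.randn(5, 128), steps=3))
```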
“…For example, we don't pretrain HiBERT from scratch for document modeling as in . Instead, we initialize our HiBERT models with publicly available RoBERTa checkpoints, following the superior performance of RoBERTaSum over BERTSum. We use a different number of layers in the document encoder (L_doc = 3) and in the sentence encoder (L_sent = 9), as opposed to an equal number of layers (L = 6) in both encoders of .…”
Results-table rows spilled into this excerpt; the scores appear to be ROUGE-1 / ROUGE-2 / ROUGE-L:
(Zhang et al., 2018): 41.05 / 18.77 / 37.54
Refresh (Narayan et al., 2018b): 41.00 / 18.80 / 37.70
BanditSum (Dong et al., 2018): 41.50 / 18.70 / 37.60
NeuSUM (Zhou et al., 2018): 41.59 / 19.01 / 37.98
ExConSum (Mendes et al., 2019): 41.70 / 18.60 / 37.80
JECS (Xu and Durrett, 2019): 41.70 / 18.50 / 37.90
LSTM+PN (Zhong et al., 2019b): 41.85 / 18.93 / 38.13
HER (Luo et al., 2019): 42.30 / 18.90 / 37.60
HiBERT: 42.37 / 19.95 / 38.83
PNBERT (Zhong et al., 2019a): 42.69 / 19.60 / 38.85
BERTSum (Liu and Lapata, 2019b): 42…
Section: Stepwise ETCSum
confidence: 99%
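This excerpt contrasts a deeper sentence-level encoder (L_sent = 9) with a shallower document-level encoder (L_doc = 3). A small hierarchical-encoder sketch of that layer split is shown below; the dimensions, mean pooling, and module layout are illustrative assumptions, not the HiBERT/RoBERTa configuration described in the cited work.

```python
import torch
import torch.nn as nn

def make_encoder(num_layers, dim=256, heads=4):
    layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=num_layers)

class HierarchicalEncoder(nn.Module):
    """Sketch: a deep sentence-level encoder feeds pooled sentence vectors
    into a shallow document-level encoder (L_sent = 9, L_doc = 3)."""

    def __init__(self, dim=256, l_sent=9, l_doc=3):
        super().__init__()
        self.sent_encoder = make_encoder(l_sent, dim)
        self.doc_encoder = make_encoder(l_doc, dim)

    def forward(self, token_embs):
        # token_embs: (num_sentences, tokens_per_sentence, dim)
        sent_vecs = self.sent_encoder(token_embs).mean(dim=1)  # pool tokens
        return self.doc_encoder(sent_vecs.unsqueeze(0))        # (1, num_sent, dim)

# Usage: 8 sentences of 20 tokens each, embedding dimension 256.
enc = HierarchicalEncoder()
print(enc(torch.randn(8, 20, 256)).shape)  # torch.Size([1, 8, 256])
```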
“…Some effort has been made to combine these two branches. Most existing works use the extract-then-abstract framework, which first extracts the summary-worthy sentences and then abstracts each of them (Dong et al., 2018; Mendes et al., 2019; Chen and Bansal, 2018). However, they suffer from information loss in the abstraction stage, since every sentence is compressed and pruned without distinction.…”
Section: Rewrite
confidence: 99%