Recently proposed pre-trained generation models achieve strong performance on single-document summarization benchmarks. However, most of them are pre-trained with general-purpose objectives and are mainly designed to process single-document inputs. In this paper, we propose PRIMER, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data. Specifically, we adopt the Longformer architecture with suitable input transformations and global attention to accommodate multi-document inputs, and we use the Gap Sentence Generation objective with a new strategy, called Entity Pyramid, for selecting sentences salient to the whole cluster, teaching the model to select and aggregate information across a cluster of related documents. In extensive experiments on 6 multi-document summarization datasets from 3 different domains, under zero-shot, few-shot, and fully-supervised settings, PRIMER outperforms current state-of-the-art models by large margins in most of these settings.¹

* Work mainly done during an internship at AI2.
¹ The code and pre-trained models will be released at https://github.com/allenai/PRIMER
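The idea of scoring sentences by how salient their entities are across the whole document cluster can be illustrated with a minimal sketch. This is not the paper's implementation: it substitutes a crude capitalization heuristic for a real named-entity tagger, and the scoring (count how many documents mention each entity, then rank sentences by the total cross-document frequency of their entities) is a simplified stand-in for the Entity Pyramid strategy described above. All function names here are hypothetical.

```python
import re
from collections import Counter

def split_sentences(doc):
    """Very rough sentence splitter (illustrative only)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]

def entities(sentence):
    """Proxy for named entities: capitalized tokens after the first word.
    A real pipeline would use a proper NER tagger; this keeps the sketch
    dependency-free (and misses entities at sentence starts)."""
    tokens = sentence.split()
    return {t.strip(".,;:!?") for t in tokens[1:] if t[:1].isupper()}

def entity_pyramid_select(docs, k=2):
    """Score each entity by how many documents in the cluster mention it,
    then return the k sentences with the highest total entity score.
    Entities seen in only one document contribute nothing, so selection
    favors sentences about cluster-wide (salient) entities."""
    doc_sents = [split_sentences(d) for d in docs]
    freq = Counter()
    for sents in doc_sents:
        doc_ents = set().union(*(entities(s) for s in sents)) if sents else set()
        freq.update(doc_ents)  # document frequency, not raw mention count
    scored = []
    for sents in doc_sents:
        for s in sents:
            score = sum(freq[e] for e in entities(s) if freq[e] > 1)
            scored.append((score, s))
    scored.sort(key=lambda t: -t[0])
    return [s for _, s in scored[:k]]
```

For example, given a cluster where "Apple" and "Cupertino" recur across documents, the sentence mentioning both is selected first; in pre-training, such selected sentences would be masked out and used as generation targets.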