Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.380

Efficiently Summarizing Text and Graph Encodings of Multi-Document Clusters

Abstract: This paper presents an efficient graph-enhanced approach to multi-document summarization (MDS) with an encoder-decoder Transformer model. This model is based on recent advances in pre-training both encoder and decoder on very large text data, and it incorporates an efficient encoding mechanism (Beltagy et al., 2020) that avoids the quadratic memory growth typical for traditional Transformers. We show that this powerful combination not only scales to large input documents commonly found when summarizing news clusters…
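The efficient encoding mechanism cited in the abstract (Beltagy et al., 2020) limits each token's attention to a fixed local window, so memory grows linearly in the input length rather than quadratically. Below is a minimal sketch of that sliding-window idea, assuming a single-head, unbatched setting; the function name and sizes are illustrative and this is not the authors' implementation.

```python
# Minimal sketch of sliding-window attention (Beltagy et al., 2020-style):
# each query attends only to keys within a local window, so no seq_len x seq_len
# score matrix is ever materialized. Illustrative only.
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int):
    """q, k, v: (seq_len, dim) tensors; window: one-sided window size."""
    seq_len, dim = q.shape
    out = torch.zeros_like(v)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        scores = q[i] @ k[lo:hi].T / dim ** 0.5        # (hi - lo,) local scores
        out[i] = F.softmax(scores, dim=-1) @ v[lo:hi]  # weighted sum of local values
    return out

q = k = v = torch.randn(4096, 64)
y = sliding_window_attention(q, k, v, window=256)  # never builds a 4096 x 4096 matrix
```

With a 4096-token input and a 256-token one-sided window, each query scores at most 513 keys, a small fraction of the 4096 keys full attention would require.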

Cited by 35 publications (35 citation statements)
References 18 publications
“…512/256 and 1024/1024 respectively) on all of the datasets (except for the zero-shot experiments; the details can be found in Sec. 3.3).[10] We use the same length limit as our model for the LED model, i.e. 4096/1024 for input and output respectively, for all the datasets.…”
[9] Pilot experiments show that simple truncation results in inferior performance, which is also in line with Pasunuru et al. (2021).
[10] Regarding the length limit of inputs for PEGASUS, we do experiments with 512, 1024, and 4096 on the Multi-News dataset, and the model with length limit 512 achieves the best performance.
Section: Zero- and Few-shot Evaluation (mentioning)
confidence: 54%
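For the LED setting quoted above (4096 input / 1024 output tokens), enforcing the limits with Hugging Face transformers amounts to truncating at tokenization time and capping the generation length. The following is a hedged sketch assuming an LED checkpoint such as allenai/led-base-16384; it is not the cited authors' code, and the sample document is a placeholder.

```python
# Sketch of applying the 4096/1024 input/output limits quoted above with LED.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MAX_INPUT, MAX_OUTPUT = 4096, 1024  # LED limits mentioned in the excerpt

tok = AutoTokenizer.from_pretrained("allenai/led-base-16384")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/led-base-16384")

long_document = " ".join(["Placeholder sentence from a news cluster."] * 500)

# Truncate the input to 4096 tokens and cap the generated summary at 1024 tokens.
batch = tok(long_document, truncation=True, max_length=MAX_INPUT, return_tensors="pt")
summary_ids = model.generate(**batch, max_length=MAX_OUTPUT)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```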
“…PEGASUS (Zhang et al., 2020) is a pre-trained model designed for abstractive summarization as a downstream task, especially for single-document input. It is trained on the objective of Gap Sentences Generation. Similar to Pasunuru et al. (2021), the inputs of all the models are the concatenations of the documents within the clusters (in the same order); each document is truncated based on the input length limit divided by the total number of documents so that all documents are represented in the input.[9] To preserve the same format as the corresponding pre-trained models, we set the inference length limit of input and output for BART and PEGASUS exactly as their pre-trained settings (i.e.…”
Section: Zero- and Few-shot Evaluation (mentioning)
confidence: 99%
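The truncation scheme described in this excerpt, where each document in a cluster is clipped to an equal share of the input budget and the pieces are concatenated in their original order, can be sketched as follows. The whitespace tokenization and the function name are simplifications introduced here, not taken from Pasunuru et al. (2021).

```python
# Sketch of equal-share truncation and concatenation for a multi-document cluster.
def build_cluster_input(documents, max_input_tokens):
    """Clip each document to max_input_tokens // len(documents) tokens, then join."""
    per_doc = max_input_tokens // len(documents)          # equal share per document
    pieces = [" ".join(doc.split()[:per_doc]) for doc in documents]
    return " ".join(pieces)                               # documents kept in original order

cluster = ["First article ...", "Second article ...", "Third article ..."]
flat_input = build_cluster_input(cluster, max_input_tokens=4096)
```

This guarantees that every document contributes to the model input, at the cost of discarding the tail of longer documents.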
“…In order to adapt to long-form text and alleviate the quadratic complexity of the full-attention operation, sparse attention is applied in the self-attention module for long-form input. Rather than attending to all other tokens, every token only attends to specific tokens, with strategies such as window attention [130,142,209], global attention [142,209], random attention [209] and Sinkhorn attention [228].…”
Section: Improved Attention Modules (mentioning)
confidence: 99%
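As a rough illustration of how the window and global patterns mentioned above sparsify attention, the sketch below builds a boolean mask allowing a local band plus a few globally attended positions. It is a toy example written for this note, not code from any of the cited works [130, 142, 209, 228].

```python
# Toy sparse-attention mask combining a local window with a few global positions.
import torch

def sparse_attention_mask(seq_len, window, global_positions):
    """Return a (seq_len, seq_len) bool mask; True where attention is allowed."""
    idx = torch.arange(seq_len)
    local = (idx[None, :] - idx[:, None]).abs() <= window  # band of width 2*window + 1
    mask = local.clone()
    mask[global_positions, :] = True   # global tokens attend to every position
    mask[:, global_positions] = True   # and every position attends to them
    return mask

mask = sparse_attention_mask(seq_len=512, window=16, global_positions=[0])
print(mask.float().mean())  # fraction of allowed query-key pairs, well below 1.0
```

Random and Sinkhorn attention add further allowed pairs by sampling or by learned sorting, but the masking idea is the same: only a small fraction of the full query-key grid is ever computed.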