Revisiting the Centroid-based Method: A Strong Baseline for Multi-Document Summarization

Ghalandari, Demian Gholipour

doi:10.18653/v1/w17-4511

Cited by 16 publications

(7 citation statements)

References 7 publications


“…If the cosine similarity is larger than the pre-defined threshold δ (see Section 5.4), the corresponding cluster is considered as a candidate for the date. Finally, we apply CENTROID-OPT (Ghalandari 2017) as a sentence ranking algorithm within a cluster and summarize each date individually by selecting one sentence per cluster with the highest ranking score.…”

Section: Timeline Summary Extractor (mentioning)

confidence: 99%

Joint Learning-based Heterogeneous Graph Attention Network for Timeline Summarization

You, Kamigaito et al. 2023

Journal of Natural Language Processing


Timeline summarization (TLS) is defined as a task for summarizing events in chronological order, which gives readers a comprehensive understanding of an evolutionary story. Previous studies on the timeline summarization (TLS) task ignored the information interaction between sentences and dates, and adopted pre-defined unlearnable representations for them, which significantly degrade the performance. They also considered date selection and event detection as two independent tasks, which makes it impossible to integrate their advantages and obtain a globally optimal summary. In this paper, we present a joint learning-based heterogeneous graph attention network for TLS (HeterTls), in which date selection and event detection are combined into a unified framework to improve the extraction accuracy and remove redundant sentences simultaneously. Our heterogeneous graph involves multiple types of nodes, the representations of which are iteratively learned across the heterogeneous graph attention layer. We evaluated our model on four datasets, and found that it significantly outperformed the current state-of-the-art baselines with regard to ROUGE scores and date selection metrics.
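The selection step quoted in the citation statement above (a cosine threshold δ, then one top-ranked sentence per retained cluster) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the excerpt does not say which vectors are compared, so comparing a date vector with a cluster vector is an assumption, and `summarize_date`, the `clusters` layout, and the `score` callable (a stand-in for the CENTROID-OPT ranking score) are hypothetical names.

```python
import math


def cosine_dense(u, v):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def summarize_date(date_vec, clusters, delta, score):
    """Pick one sentence from every cluster that passes the threshold.

    clusters: list of (cluster_vec, sentences) pairs (hypothetical layout).
    delta:    the pre-defined similarity threshold.
    score:    callable mapping a sentence to a ranking score; a stand-in
              for whatever sentence ranker is used within a cluster.
    """
    summary = []
    for cluster_vec, sentences in clusters:
        # Assumption: similarity is measured between date and cluster vectors.
        if cosine_dense(date_vec, cluster_vec) > delta:
            # Keep only the highest-scoring sentence of a candidate cluster.
            summary.append(max(sentences, key=score))
    return summary
```

Passing the ranker in as a callable keeps the threshold logic separate from the ranking method, so any sentence scorer can be substituted.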


“…Unsupervised Opinion Summarization. Extractive summarization consists in selecting a few sentences from the input documents to form the output summary. The centroid method (Rossiello et al, 2017; Gholipour Ghalandari, 2017) consists in ranking sentences according to their relevance to the whole input. Graph-based methods, such as LexRank (Erkan and Radev, 2004) or TextRank (Mihalcea and Tarau, 2004; Zheng and Lapata, 2019), use the PageRank algorithm to find the most central sentences in a graph of input sentences, where edge weights indicate word overlap.…”

Section: Related Work (mentioning)

confidence: 99%

Self-Supervised and Controlled Multi-Document Opinion Summarization

Elsahar, Coavoux, Rozen et al. 2021

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume


We address the problem of unsupervised abstractive summarization of collections of user generated reviews through self-supervision and control. We propose a self-supervised setup that considers an individual document as a target summary for a set of similar documents. This setting makes training simpler than previous approaches by relying only on standard log-likelihood loss and mainstream models. We address the problem of hallucinations through the use of control codes, to steer the generation towards more coherent and relevant summaries. Our benchmarks on two English datasets against graph-based and recent neural abstractive unsupervised models show that our proposed method generates summaries with a superior quality and relevance, as well as a high sentiment and topic alignment with the input reviews. This is confirmed in our human evaluation which focuses explicitly on the faithfulness of generated summaries. We also provide an ablation study showing the importance of the control setup in controlling hallucinations.
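The graph-based ranking mentioned in the citation statement above (PageRank over a sentence graph whose edge weights reflect word overlap) can be illustrated with a short sketch. It is a simplification written for this listing, not LexRank or TextRank as published: the edge weight is assumed to be the raw count of shared lowercase tokens, tokenization is a plain whitespace split, and `overlap` and `pagerank_sentences` are hypothetical names.

```python
def overlap(a, b):
    """Edge weight: number of distinct lowercase tokens two sentences share."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def pagerank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by power iteration on a word-overlap graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Symmetric weight matrix with no self-loops.
    weights = [[overlap(sentences[i], sentences[j]) if i != j else 0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            # Each neighbour passes on its score in proportion to edge weight.
            # Sentences with no neighbours pass nothing on (a simplification).
            incoming = sum(scores[i] * weights[i][j] / out_sums[i]
                           for i in range(n) if out_sums[i] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores
```

A sentence that shares no words with the rest receives only the baseline `(1 - damping) / n` score, so it ranks below sentences connected to the rest of the graph.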


“…Unsupervised extractive summarization methods consists in selecting the most salient sentences from a text. Saliency can be quantified with the centroid method (Gholipour Ghalandari, 2017; Rossiello et al, 2017), which consists in computing vector representations for sentences and selecting which sentences are the closest to their centroid, and thus the most representative of the set. Other proposals make use of the PageRank algorithm (Mihalcea and Tarau, 2004; Erkan and Radev, 2004) to compute sentence saliency.…”

Section: Related Work (mentioning)

confidence: 99%

Unsupervised Aspect-Based Multi-Document Abstractive Summarization

Coavoux, Elsahar, Gallé 2019

Proceedings of the 2nd Workshop on New Frontiers in Summarization


User-generated reviews of products or services provide valuable information to customers. However, it is often impossible to read each of the potentially thousands of reviews: it would therefore save valuable time to provide short summaries of their contents. We address opinion summarization, a multi-document summarization task, with an unsupervised abstractive summarization neural system. Our system is based on (i) a language model that is meant to encode reviews to a vector space, and to generate fluent sentences from the same vector space (ii) a clustering step that groups together reviews about the same aspects and allows the system to generate summary sentences focused on these aspects. Our experiments on the Oposum dataset empirically show the importance of the clustering step.
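The centroid method as described in the citation statement above (represent each sentence as a vector, then prefer the sentences closest to the centroid) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the cited paper's method: sentences are represented by raw bag-of-words counts instead of any particular weighting or embedding, tokenization is a whitespace split, and `bow`, `cosine`, and `rank_by_centroid` are hypothetical names.

```python
import math
from collections import Counter


def bow(sentence):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_centroid(sentences):
    """Return sentences sorted from most to least similar to the centroid."""
    vectors = [bow(s) for s in sentences]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)  # summing suffices: cosine ignores vector scale
    scored = [(cosine(v, centroid), s) for v, s in zip(vectors, sentences)]
    # Sort on the score only, so tied sentences keep their input order.
    return [s for _, s in sorted(scored, key=lambda p: p[0], reverse=True)]
```

With the inputs "the battery is great", "the battery is good", and "shipping was slow", the third sentence shares no words with the others and is ranked last.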
