Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1188

Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints

Abstract: We present a discriminative model for single-document summarization that integrally combines compression and anaphoricity constraints. Our model selects textual units to include in the summary based on a rich set of sparse features whose weights are learned on a large corpus. We allow for the deletion of content within a sentence when that deletion is licensed by compression rules; in our framework, these are implemented as dependencies between subsentential units of text. Anaphoricity constraints then improve …
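The abstract sketches the core mechanism: score subsentential units, then select a subset under a length budget while respecting the dependencies that license deletions. Below is a minimal ILP-style sketch of that selection step; the unit scores, word lengths, single-parent dependency structure, and the PuLP/CBC solver are all illustrative assumptions, not the authors' exact formulation.

```python
# Sketch: choose subsentential units to maximize learned scores under a word budget,
# allowing a dependent unit only when the unit it depends on is also selected
# (the compression "dependencies" mentioned in the abstract).
import pulp

def select_units(scores, lengths, parents, budget):
    n = len(scores)
    prob = pulp.LpProblem("summary_selection", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(n)]
    prob += pulp.lpSum(scores[i] * x[i] for i in range(n))             # total score objective
    prob += pulp.lpSum(lengths[i] * x[i] for i in range(n)) <= budget  # length budget
    for i, p in enumerate(parents):
        if p is not None:
            prob += x[i] <= x[p]  # keep unit i only if its governing unit p is kept
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in range(n) if x[i].value() == 1]
```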

Cited by 130 publications (144 citation statements)
References 37 publications
“…The LEAD-3 baseline (selecting the first three sentences in each document as the summary) is extremely difficult to beat on CNN/DailyMail (Narayan et al, 2018b,a), which implies that salient information is mostly concentrated in the beginning of a document. NYT writers follow less prescriptive guidelines 2 , and as a result salient information is distributed more evenly in the course of an article (Durrett et al, 2016). We therefore view the NYT annotated corpus (Sandhaus, 2008) as complementary to CNN/DailyMail in terms of evaluating the model's ability of finding salient information.…”
Section: Datasets
Mentioning confidence: 99%
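The LEAD-3 baseline referenced in this statement simply takes the first three sentences of each article as the summary. A minimal sketch of that baseline is below; NLTK sentence splitting is an illustrative assumption (the cited papers use their own preprocessing pipelines).

```python
# LEAD-k baseline: take the first k sentences of a document as its summary.
# Requires the NLTK "punkt" tokenizer data; the splitter choice is an assumption.
import nltk

def lead_k_summary(document: str, k: int = 3) -> str:
    sentences = nltk.sent_tokenize(document)
    return " ".join(sentences[:k])
```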
“…On the CNN/Daily Mail and DUC-2002 dataset, we use standard ROUGE-1, ROUGE-2, and ROUGE-L (Lin, 2004) on full-length F1 with stemming as previous work did (Nallapati et al, 2017; See et al, 2017; Chen and Bansal, 2018). On NYT50 dataset, following Durrett et al (2016) and Paulus et al (2018), we used the limited length ROUGE recall metric, truncating the generated summary to the length of the ground truth summary. Table 1 shows the experimental results on CNN/Daily Mail dataset, with extractive models in the top block and abstractive models in the bottom block.…”
Section: Discussion
Mentioning confidence: 99%
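The limited-length ROUGE recall protocol described in this statement truncates each system summary to the reference length before scoring. A hedged sketch follows; word-level truncation and the rouge-score package are illustrative stand-ins for the official ROUGE-1.5.5 toolkit used in the cited work.

```python
# Limited-length ROUGE recall: truncate the generated summary to the length of the
# ground-truth summary, then report recall.
from rouge_score import rouge_scorer

def limited_length_rouge_recall(generated: str, reference: str) -> dict:
    ref_len = len(reference.split())
    truncated = " ".join(generated.split()[:ref_len])  # word-level truncation (assumption)
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    scores = scorer.score(reference, truncated)
    return {name: s.recall for name, s in scores.items()}
```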
“…The New York Times dataset also consists of many news articles. We followed the dataset splits of Durrett et al (2016); 100,834 for training and…”
Section: Datasets
Mentioning confidence: 99%
“…We extracted the first 3 sentences for CNN documents and the first 4 sentences for DailyMail (Narayan et al, 2018b). Following previous work (Durrett, Berg-Kirkpatrick, & Klein, 2016; Paulus et al, 2018), we obtained lead summaries based on the first 100 words for NY Times documents. For Newsroom, we extracted the first 2 sentences to form the lead summaries.…”
Section: How Abstractive Is Xsum?
Mentioning confidence: 99%
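This statement describes per-dataset lead-summary rules: the first 3 sentences for CNN, 4 for DailyMail, 2 for Newsroom, and the first 100 words for NY Times. A small sketch of that logic follows; the dataset keys and nltk.sent_tokenize are illustrative assumptions, not the cited papers' preprocessing.

```python
# Build lead summaries with per-dataset rules, as described in the quoted statement.
import nltk

LEAD_SENTENCES = {"cnn": 3, "dailymail": 4, "newsroom": 2}

def lead_summary(document: str, dataset: str) -> str:
    if dataset == "nyt":
        return " ".join(document.split()[:100])  # first 100 words for NY Times
    return " ".join(nltk.sent_tokenize(document)[:LEAD_SENTENCES[dataset]])
```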