Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1620

Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses

Abstract: Sentence position is a strong feature for news summarization, since the lead often (but not always) summarizes the key points of the article. In this paper, we show that recent neural systems excessively exploit this trend, which although powerful for many inputs, is also detrimental when summarizing documents where important content should be extracted from later parts of the article. We propose two techniques to make systems sensitive to the importance of content in different parts of the article. The first …
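The abstract is truncated before it spells out the two techniques. As a purely illustrative sketch (not the authors' formulation), the snippet below shows one way an auxiliary loss can discourage lead bias in an extractive scorer: penalize selection probability that concentrates on the first few sentences beyond what a uniform prior would allocate. All names here (`lead_bias_penalty`, `sentence_logits`, `lead_k`) are hypothetical.

```python
# Illustrative auxiliary loss, assuming an extractive model that emits one
# selection logit per sentence. A sketch only, not the cited paper's loss.
import torch
import torch.nn.functional as F

def lead_bias_penalty(sentence_logits: torch.Tensor, lead_k: int = 3) -> torch.Tensor:
    """Penalize selection probability concentrated on the first `lead_k` sentences.

    sentence_logits: (batch, num_sentences) raw scores from the extractor.
    """
    probs = torch.softmax(sentence_logits, dim=-1)
    lead_mass = probs[:, :lead_k].sum(dim=-1)         # mass placed on the lead
    uniform_mass = lead_k / sentence_logits.size(-1)  # mass under a uniform prior
    # Only penalize mass above the uniform level, so articles whose summary
    # genuinely lives in the lead are not punished outright.
    return F.relu(lead_mass - uniform_mass).mean()

# Combined objective (hypothetical weighting):
#   loss = extraction_loss + 0.1 * lead_bias_penalty(sentence_logits)
```

The penalty only activates when the model over-selects the lead, which matches the abstract's observation that the lead often, but not always, carries the key points.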

Cited by 31 publications (32 citation statements). References: 21 publications.
Citation types: 0 supporting, 31 mentioning, 0 contrasting.
“…Furthermore, abstracts are almost always available rather than behind paywalls like full-text articles. For news summarisation, we used a state-of-the-art extractive model (Grenander et al., 2019) to extract sentences forming a summary of the original text. This model provides a summary de-biasing mechanism preventing it from focusing on specific parts of the full article, preserving the summary's informational authenticity as much as possible.…”
Section: Article Summarisation (mentioning; confidence: 99%)
“…The high level of abstractiveness makes our dataset challenging, since models cannot simply copy sentences from the reference articles (Grenander et al., 2019). The extractive oracle performance indicates the level of “extractiveness” of each dataset.…”
Section: Dataset Creation (mentioning; confidence: 99%)
“…High positional and extractive biases can undesirably enable models to achieve high summarization scores by copying sentences from certain (fixed) positions, e.g. lead sentences in news summarization (Grenander et al., 2019; Narayan et al., 2018a). Empirical results show that our dataset is challenging and requires models to have a high level of text abstractiveness.…”
Section: Introduction (mentioning; confidence: 99%)
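The positional bias this statement describes is easy to observe directly: on lead-heavy news data, a “summary” made of the first few sentences already scores well. A minimal sketch using Google's rouge-score package; the toy article, reference, and the `lead_k_summary` helper are assumptions for illustration, not part of the cited work.

```python
# Measure how well a Lead-3 baseline scores against a reference summary.
# Uses the rouge-score package (pip install rouge-score).
from rouge_score import rouge_scorer

def lead_k_summary(sentences, k=3):
    """Hypothetical helper: the first k sentences, joined as a 'summary'."""
    return " ".join(sentences[:k])

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# Toy example only; real evaluations average over a whole test set.
article = [
    "The city council approved the new budget on Monday.",
    "The plan increases transit funding by ten percent.",
    "Officials said the vote followed months of debate.",
    "Buried later: the library budget was quietly halved.",
]
reference = "The council approved a budget raising transit funding ten percent."

scores = scorer.score(reference, lead_k_summary(article))
print({name: round(s.fmeasure, 3) for name, s in scores.items()})
```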
“…It may be tempting to apply neural abstractive summarization to meetings given its remarkable recent success on summarization benchmarks, e.g., CNN/DM (See et al., 2017; Chen and Bansal, 2018; Gehrmann et al., 2018; Laban et al., 2020). However, the challenge lies not only in handling the hallucinations seen in abstractive models (Kryscinski et al., 2019; Lebanoff et al., 2019; Maynez et al., 2020) but also in the models' strong positional bias, which arises as a consequence of fine-tuning on news articles (Kedzie et al., 2018; Grenander et al., 2019). Neural summarizers also assume a maximum sequence length; e.g., Perez-Beltrachini et al. (2019) use the first 800 tokens of the document as input.…”
Section: Introduction (mentioning; confidence: 99%)
“…The pretraining data contain 160GB of news, books, stories, and web text. It remains unclear whether the model can effectively identify salient content in spoken text and how well it can reduce lead bias, which is not as frequent in spoken text as in news writing (Grenander et al., 2019). Secondly, a transcript can far exceed the maximum input length of the model, which is restricted by the GPU memory size.…”
Section: Introduction (mentioning; confidence: 99%)
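The length problem this last statement raises, transcripts far exceeding a model's maximum input, is commonly worked around by summarizing overlapping chunks and merging the results. A minimal sketch under that assumption; the window sizes, tokenization, and `chunk_tokens` helper are illustrative, not the cited paper's method.

```python
# Split an over-long token sequence into overlapping windows that each fit a
# model's input limit. Window and overlap sizes here are illustrative only.
from typing import List

def chunk_tokens(tokens: List[str], max_tokens: int = 800,
                 overlap: int = 100) -> List[List[str]]:
    """Overlapping windows of at most `max_tokens`, stepping by max_tokens - overlap."""
    step = max_tokens - overlap
    return [tokens[i:i + max_tokens]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

transcript = "so um the main decision today concerns the budget for phase two".split()
for i, chunk in enumerate(chunk_tokens(transcript, max_tokens=6, overlap=2)):
    print(i, " ".join(chunk))
# Each chunk would be summarized independently and the partial summaries merged.
```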