Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.33
What Have We Achieved on Text Summarization?

Abstract: Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 repres…

Cited by 45 publications (35 citation statements). References 40 publications (80 reference statements).
“…Modern text generation models are known to hallucinate facts (Huang et al., 2020), which has led the community to create models to detect and correct hallucinations (Cao et al., 2020; …)…”
Section: Inaccuracy Guardrail
confidence: 99%
“…notation task followed Huang et al. (2020) and consisted of relevance, consistency, fluency, and coherency.…”
Section: Human Evaluation
confidence: 99%
“…In this paper, we focus on coherent paragraph summarization datasets. Automatic evaluation of summarization systems, e.g., by using the ROUGE metric, is challenging (Lloret et al., 2018) and is often inconsistent with human evaluation (Liu and Liu, 2008; Cohan and Goharian, 2016; Tay et al., 2019; Huang et al., 2020). To understand, and later improve, the quality of summarization systems, it is necessary to conduct a human evaluation.…”
Section: Introduction
confidence: 99%
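The excerpt above points to ROUGE as the standard automatic metric and notes its frequent disagreement with human judgment. As a minimal illustration of how such scores are computed in practice, the sketch below uses Google's open-source `rouge-score` package; the sentences and metric variants are illustrative assumptions, not the evaluation setup of any of the cited papers.

```python
# Minimal ROUGE scoring sketch (rouge-score package); illustrative only.
from rouge_score import rouge_scorer

# Hypothetical reference and system summary for demonstration.
reference = "The council approved the new budget after a long debate."
candidate = "After a long debate, the council passed the new budget."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)

for name, score in scores.items():
    # Each entry holds precision, recall, and F-measure for one ROUGE variant.
    print(f"{name}: P={score.precision:.2f} R={score.recall:.2f} F1={score.fmeasure:.2f}")
```

Because these scores reward lexical overlap, a summary can score well while still being unfaithful or incoherent, which is one reason the excerpt calls automatic evaluation inconsistent with human evaluation.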
“…The authors of [11] also compared human responses with automatic evaluation metrics used in text summarization research [12]. Huang et al. [19] defined eight errors, including missing key points and unnecessary repetition, and asked users to manually select errors in computer-generated summaries to investigate the limitations of prevailing automatic summarization methods. In these studies, the CNN/DM dataset [20], which consists of news articles, is used for manual evaluation of summaries.…”
Section: Related Work
confidence: 99%
“…We employed an unsupervised algorithm, called TextRank [27], to compute sentence importance based on sentence connectivity and generate summaries that consist of important sentences only. TextRank is still a strong baseline that shows performance competitive with a recent summarization method [19]. Note that we exclude the "abstract" and the "reference" sections in the original papers and use only the paper bodies as inputs.…”
Section: Reading Materials
confidence: 99%
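The excerpt above describes TextRank: an unsupervised method that scores sentence importance from sentence-to-sentence connectivity and extracts the top-scoring sentences. The following is a minimal sketch of that general idea, assuming a word-overlap similarity and PageRank via `networkx`; it is not the exact configuration used in the quoted study.

```python
# Minimal TextRank-style extractive summarizer (illustrative sketch).
import math
import re

import networkx as nx


def split_sentences(text):
    # Naive sentence splitter; real systems would use a proper tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(s1, s2):
    # Word overlap normalized by sentence lengths, as in Mihalcea & Tarau (2004).
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if len(w1) < 2 or len(w2) < 2:
        return 0.0
    return len(w1 & w2) / (math.log(len(w1)) + math.log(len(w2)))


def textrank_summary(text, n_sentences=3):
    sents = split_sentences(text)
    graph = nx.Graph()
    graph.add_nodes_from(range(len(sents)))
    # Connect sentences whose overlap is non-zero; edge weight = similarity.
    for i in range(len(sents)):
        for j in range(i + 1, len(sents)):
            w = similarity(sents[i], sents[j])
            if w > 0:
                graph.add_edge(i, j, weight=w)
    scores = nx.pagerank(graph, weight="weight")
    top = sorted(scores, key=scores.get, reverse=True)[:n_sentences]
    # Return the highest-ranked sentences in their original document order.
    return " ".join(sents[i] for i in sorted(top))
```

Calling `textrank_summary(document_text, n_sentences=3)` returns the three highest-ranked sentences in their original order, which matches the excerpt's description of generating summaries that consist of important sentences only.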