Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.33
What Have We Achieved on Text Summarization?

Abstract: Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 repres…

Cited by 45 publications (35 citation statements). References 40 publications (80 reference statements).
“…Modern text generation models are known to hallucinate facts (Huang et al., 2020), which has led the community to create models to detect and correct hallucinations (Cao et al., 2020; …)…”
Section: Inaccuracy Guardrail
confidence: 99%
“…notation task followed Huang et al. (2020) and consisted of relevance, consistency, fluency, and coherency.…”
Section: Human Evaluation
confidence: 99%
“…In this paper, we focus on coherent paragraph summarization datasets. Automatic evaluation of summarization systems, e.g., by using the ROUGE metric, is challenging (Lloret et al., 2018) and is often inconsistent with human evaluation (Liu and Liu, 2008; Cohan and Goharian, 2016; Tay et al., 2019; Huang et al., 2020). To understand, and later improve, the quality of summarization systems, it is necessary to conduct a human evaluation.…”
Section: Introduction
confidence: 99%
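The excerpt above points to ROUGE as the standard automatic metric and notes its frequent disagreement with human judgment. As a minimal illustration of how such scores are computed in practice, the sketch below uses Google's open-source `rouge-score` package; the sentences and metric variants are illustrative assumptions, not the evaluation setup of any of the cited papers.

```python
# Minimal ROUGE scoring sketch (rouge-score package); illustrative only.
from rouge_score import rouge_scorer

# Hypothetical reference and system summary for demonstration.
reference = "The council approved the new budget after a long debate."
candidate = "After a long debate, the council passed the new budget."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)

for name, score in scores.items():
    # Each entry holds precision, recall, and F-measure for one ROUGE variant.
    print(f"{name}: P={score.precision:.2f} R={score.recall:.2f} F1={score.fmeasure:.2f}")
```

Because these scores reward lexical overlap, a summary can score well while still being unfaithful or incoherent, which is one reason the excerpt calls automatic evaluation inconsistent with human evaluation.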
“…The authors of [11] also compared human responses with automatic evaluation metrics used in text summarization research [12]. Huang et al. [19] defined eight errors, including missing key points and unnecessary repetition, and asked users to manually select errors in computer-generated summaries to investigate the limitations of prevailing automatic summarization methods. In these studies, the CNN/DM dataset [20], which consists of news articles, is used for manual evaluation of summaries.…”
Section: Related Work
confidence: 99%
“…We employed an unsupervised algorithm, called TextRank [27], to compute sentence importance based on sentence connectivity and generate summaries that consist of important sentences only. TextRank is still a strong baseline that shows performance competitive with a recent summarization method [19]. Note that we exclude the "abstract" and the "reference" sections in the original papers and use only the paper bodies as inputs.…”
Section: Reading Materials
confidence: 99%
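The excerpt above describes TextRank: an unsupervised method that scores sentence importance from sentence-to-sentence connectivity and extracts the top-scoring sentences. The following is a minimal sketch of that general idea, assuming a word-overlap similarity and PageRank via `networkx`; it is not the exact configuration used in the quoted study.

```python
# Minimal TextRank-style extractive summarizer (illustrative sketch).
import math
import re

import networkx as nx


def split_sentences(text):
    # Naive sentence splitter; real systems would use a proper tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(s1, s2):
    # Word overlap normalized by sentence lengths, as in Mihalcea & Tarau (2004).
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if len(w1) < 2 or len(w2) < 2:
        return 0.0
    return len(w1 & w2) / (math.log(len(w1)) + math.log(len(w2)))


def textrank_summary(text, n_sentences=3):
    sents = split_sentences(text)
    graph = nx.Graph()
    graph.add_nodes_from(range(len(sents)))
    # Connect sentences whose overlap is non-zero; edge weight = similarity.
    for i in range(len(sents)):
        for j in range(i + 1, len(sents)):
            w = similarity(sents[i], sents[j])
            if w > 0:
                graph.add_edge(i, j, weight=w)
    scores = nx.pagerank(graph, weight="weight")
    top = sorted(scores, key=scores.get, reverse=True)[:n_sentences]
    # Return the highest-ranked sentences in their original document order.
    return " ".join(sents[i] for i in sorted(top))
```

Calling `textrank_summary(document_text, n_sentences=3)` returns the three highest-ranked sentences in their original order, which matches the excerpt's description of generating summaries that consist of important sentences only.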