Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1053

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

Abstract: A robust evaluation metric has a profound impact on the development of text generation systems. A desirable metric compares system output against references based on their semantics rather than surface forms. In this paper we investigate strategies to encode system and reference texts to devise a metric that shows a high correlation with human judgment of text quality. We validate our new metric, namely MoverScore, on a number of text generation tasks including summarization, machine translation, image captioning, and data-to-text generation.
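For readers who want to apply the metric in practice, the sketch below shows how scoring might look with the authors' released Python package. The module and function names (`moverscore_v2`, `get_idf_dict`, `word_mover_score`) and their arguments are taken from the public repository and should be treated as assumptions; they may differ between versions.

```python
# Hedged sketch: scoring system outputs against references with MoverScore.
# Assumes the authors' released package (pip install moverscore) exposes
# get_idf_dict and word_mover_score as in their repository.
from moverscore_v2 import get_idf_dict, word_mover_score

references = ["The cat sat on the mat.", "A storm hit the coast overnight."]
hypotheses = ["A cat was sitting on the mat.", "Overnight, a storm struck the coastline."]

# IDF weights are computed over each side of the corpus and used to weight tokens.
idf_ref = get_idf_dict(references)
idf_hyp = get_idf_dict(hypotheses)

# One score per reference/hypothesis pair; higher means semantically closer.
scores = word_mover_score(references, hypotheses, idf_ref, idf_hyp,
                          stop_words=[], n_gram=1, remove_subwords=True)
print(scores)
```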

Cited by 316 publications (310 citation statements); References 46 publications.

Citation statements (ordered by relevance):
“…For reference, we report the standard summarization baselines described in the previous section. The summaries are evaluated with two automatic evaluation metrics: ROUGE-2 recall with stopwords removed (R-2) (Lin, 2004) and a recent BERT-based evaluation metric (MOVER) (Zhao et al., 2019). The results, reported in Table 4, are encouraging since the systems based on the learned priors outperform the uniform prior.…”
Section: Extracting Summaries: Example (mentioning)
confidence: 92%
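The excerpt above pairs ROUGE-2 recall (with stopwords removed) with MoverScore. Below is a minimal sketch of that kind of setup, assuming the `rouge-score` and `nltk` packages and a hypothetical `strip_stopwords` helper; the cited work's exact preprocessing is not specified here, so treat this as an approximation.

```python
# Hedged sketch of the quoted evaluation setup: ROUGE-2 recall computed after
# removing English stopwords. Uses the rouge-score and nltk packages.
import nltk
from nltk.corpus import stopwords
from rouge_score import rouge_scorer

nltk.download("stopwords", quiet=True)
STOP = set(stopwords.words("english"))

def strip_stopwords(text: str) -> str:
    # Lowercase and drop stopwords before n-gram matching.
    return " ".join(w for w in text.lower().split() if w not in STOP)

scorer = rouge_scorer.RougeScorer(["rouge2"], use_stemmer=True)

def rouge2_recall(reference: str, summary: str) -> float:
    # rouge-score expects (target, prediction); we pass preprocessed strings.
    result = scorer.score(strip_stopwords(reference), strip_stopwords(summary))
    return result["rouge2"].recall

print(rouge2_recall("the cat sat on the mat near the door",
                    "a cat was sitting on the mat"))
```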
“…MoverScore employs a contextualized embedding model and a variant of the Earth Mover Distance (Rubner et al., 2000) to measure the similarity between sentence pairs (Zhao et al., 2019). Given two sentences, MoverScore aligns similar words from each sentence and computes the flow traveling between these words.…”
Section: Baseline Models (mentioning)
confidence: 99%
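A minimal sketch of the mechanism described in this excerpt: embed each token with a contextualized encoder, then measure how much mass must travel to align one sentence's tokens with the other's. To stay short it uses uniform token weights and a relaxed, one-directional approximation of the transport cost rather than the exact Earth Mover Distance, and none of MoverScore's IDF weighting, so it illustrates the idea rather than reproducing the metric.

```python
# Hedged sketch: contextualized token embeddings + a relaxed transport cost.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

def embed(sentence: str) -> np.ndarray:
    # Contextualized embedding for every sub-word token, special tokens dropped.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden[1:-1].numpy()  # strip [CLS]/[SEP]

def relaxed_mover_distance(hyp: str, ref: str) -> float:
    h, r = embed(hyp), embed(ref)
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    cost = 1.0 - h @ r.T  # cosine distance between every hyp/ref token pair
    # Uniform token weights; each hypothesis token sends its mass to the
    # cheapest reference token, a lower bound on the true transport cost.
    return float(cost.min(axis=1).mean())

print(relaxed_mover_distance("A storm struck the coast overnight.",
                             "Overnight, a storm hit the coastline."))
```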
“…Lin proposed a framework based on global encoding, which used a convolutional gating unit to control the information flow from the encoder to the decoder according to the global information of the input context. Wei [Wei et al. 2019] proposed a regularization approach to the sequence-to-sequence model for the Chinese social media summarization task, which could improve semantic consistency. Based on a double attention pointer network, Li [Li et al. 2020] proposed an encoder-decoder model that achieved higher summarization performance on the CNN/Daily Mail dataset and the LCSTS dataset.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some evaluation methods (ROUGE [Lin and Hovy 2003], BLEU [Papineni et al. 2002], MoverScore [Zhao et al. 2019]) are adopted in text summarization. The evaluation of large-scale summarization models is costly and cumbersome.…”
Section: Setting (mentioning)
confidence: 99%