2022
DOI: 10.48550/arxiv.2204.05307
Preprint

Toward More Effective Human Evaluation for Machine Translation

Abstract: Improvements in text generation technologies such as machine translation have necessitated more costly and time-consuming human evaluation procedures to ensure an accurate signal. We investigate a simple way to reduce cost by reducing the number of text segments that must be annotated in order to accurately predict a score for a complete test set. Using a sampling approach, we demonstrate that information from document membership and automatic metrics can help improve estimates compared to a pure random sampling […]
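For intuition, the estimation idea in the abstract can be sketched in a few lines: annotate only k of n segments and use the sample mean as the test-set score, with an automatic metric guiding which segments get annotated. This is a toy illustration under assumed inputs, not the authors' implementation; the field names "human_score" and "metric_score", the equal-frequency binning, and the toy data are all hypothetical.

```python
import random
from statistics import mean

def random_sample_estimate(segments, k, seed=0):
    """Pure random baseline: annotate k segments drawn uniformly,
    then average their human scores."""
    rng = random.Random(seed)
    return mean(s["human_score"] for s in rng.sample(segments, k))

def metric_stratified_estimate(segments, k, seed=0):
    """Sort segments by an automatic metric, split them into k
    equal-frequency bins, and annotate one segment per bin, so the
    sample covers the whole quality range instead of clustering by chance."""
    rng = random.Random(seed)
    ranked = sorted(segments, key=lambda s: s["metric_score"])
    n = len(ranked)
    bins = [ranked[i * n // k:(i + 1) * n // k] for i in range(k)]
    return mean(rng.choice(b)["human_score"] for b in bins if b)

# Toy usage: 1,000 segments whose metric scores are noisy versions of
# their (normally unobserved) human scores; only 50 get "annotated".
random.seed(0)
segments = [{"human_score": h, "metric_score": h + random.gauss(0, 5)}
            for h in (random.uniform(0, 100) for _ in range(1000))]
print(random_sample_estimate(segments, 50))
print(metric_stratified_estimate(segments, 50))
```

The same stratification can be applied to document membership (one bin per document), which is the other auxiliary signal the abstract mentions.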

Cited by 2 publications (3 citation statements)
References 12 publications (16 reference statements)

Citation statements, ordered by relevance:
“…Being able to filter out exactly semantically equivalent sentence pairs would reduce this workload. Similarly, filtering out exactly semantically equivalent sentences can lessen the amount of annotation necessary for human evaluations of text (Saldías et al., 2022).…”
Section: Discussion (mentioning) · Confidence: 99%
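The filtering step this quote describes can be made concrete with a minimal sketch. This is not code from either paper; exact string identity is used as the simplest stand-in for "exactly semantically equivalent", and the pair format is assumed.

```python
def drop_equivalent_pairs(pairs):
    """Keep only sentence pairs whose two sides still differ after trivial
    normalization, so annotators never score trivially equivalent outputs.
    A real filter would use a stronger semantic-equivalence test."""
    return [(a, b) for a, b in pairs if a.strip().lower() != b.strip().lower()]
```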
“…While recent literature (Goyal et al., 2022) claims that human involvement is time-consuming and expensive, it cannot be absent. In order to determine acceptable values for human involvement, we rely on past investigation in the area (Koehn, 2009; González Rubio, 2014; Way, 2018; Kreutzer et al., 2022; Saldías et al., 2022) to answer the main questions below.…”
Section: Plan B: Workarounds (mentioning) · Confidence: 99%
“…Other work (Doherty, 2018) mentions that translation quality assessments of around 60 to 70% are acceptable. For an LRMTS, human involvement can lead to a high-quality LRMTS, as shown by Saldías et al. (2022).…”
Section: Plan B: Workarounds (mentioning) · Confidence: 99%