chrF: character n-gram F-score for automatic MT evaluation

Popović, Maja

doi:10.18653/v1/w15-3049

Cited by 613 publications

(532 citation statements)

References 6 publications

Citation statements by type: Supporting, Mentioning (525), Contrasting, Unclassified


“…We include several evaluation metrics: BLEU (Papineni et al, 2002), NIST (Doddington, 2002), TER (Snover et al, 2006), METEOR (Banerjee and Lavie, 2005) and CHRF (Popovic, 2015). These scores give an estimation of the quality of the output of the experiment when comparing to a translated reference.…”

Section: Methods (mentioning)

confidence: 99%

Applying N-gram Alignment Entropy to Improve Feature Decay Algorithms

Poncelas, Wenniger, Way (2017). The Prague Bulletin of Mathematical Linguistics


Data Selection is a popular step in Machine Translation pipelines. Feature Decay Algorithms (FDA) is a technique for data selection that has shown a good performance in several tasks. FDA aims to maximize the coverage of n-grams in the test set. However, intuitively, more ambiguous n-grams require more training examples in order to adequately estimate their translation probabilities. This ambiguity can be measured by alignment entropy. In this paper we propose two methods for calculating the alignment entropies for n-grams of any size, which can be used for improving the performance of FDA. We evaluate the substitution of the n-gram-specific entropy values computed by these methods to the parameters of both the exponential and linear decay factor of FDA. The experiments conducted on German-to-English and Czech-to-English translation demonstrate that the use of alignment entropies can lead to an increase in the quality of the results of FDA.


“…Because our method involves transliteration, which is applied at a character level, we found it also useful to evaluate the output with character-based metrics, which reward some translations even if the morphology is not completely correct. For this reason, we additionally report BEER (Stanojević and Sima'an 2014) and chrF3 (Popović 2015) scores.…”

Section: Neural Machine Translation System (mentioning)

confidence: 99%

Neural machine translation for low-resource languages without parallel corpora

Karakanta, Dehdari, Genabith (2017). Machine Translation


The problem of a total absence of parallel data is present for a large number of language pairs and can severely detriment the quality of machine translation. We describe a language-independent method to enable machine translation between a low-resource language (LRL) and a third language, e.g. English. We deal with cases of LRLs for which there is no readily available parallel data between the low-resource language and any other language, but there is ample training data between a closely related high-resource language (HRL) and the third language. We take advantage of the similarities between the HRL and the LRL in order to transform the HRL data into data similar to the LRL using transliteration. The transliteration models are trained on transliteration pairs extracted from Wikipedia article titles. Then, we automatically back-translate monolingual LRL data with the models trained on the transliterated HRL data and use the resulting parallel corpus to train our final models. Our method achieves significant improvements in translation quality, close to the results that can be achieved by a general purpose neural machine translation system trained on a significant amount of parallel data. Moreover, the method does not rely on the existence of any parallel data for training, but attempts to bootstrap already existing resources in a related language.


“…• modified CHRF3 (Popović, 2015) to compute character n-grams split by word boundary space with n ∈ [3, 7] whereas the F1 (Biçici, 2011) we already use compute with word n-grams up to n = 5.…”

Section: Referential Translation Machines (mentioning)

confidence: 99%

Predicting Translation Performance with Referential Translation Machines

Biçici (2017). Proceedings of the Second Conference on Machine Translation


Referential translation machines achieve top performance in both bilingual and monolingual settings without accessing any task or domain specific information or resource. RTMs achieve the 3rd system results for German to English sentence-level prediction of translation quality and the 2nd system results according to root mean squared error. In addition to the new features about substring distances, punctuation tokens, character n-grams, and alignment crossings, and additional learning models, we average prediction scores from different models using weights based on their training performance for improved results.
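
The citation statements above refer to chrF3 and to a variant that restricts the character n-gram orders to n ∈ [3, 7]. As a rough illustration only, the following is a minimal Python sketch of a character n-gram F-score: precision and recall are computed per n-gram order, averaged over the orders, and combined as F_β = (1 + β²)·P·R / (β²·P + R), with β = 3 giving recall more weight. The function names, the default order range of 1 to 6, the removal of whitespace, and the skipping of orders longer than either string are assumptions made for this sketch; it is not the reference implementation and not the code used by any of the citing papers.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n (whitespace removed: an assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis, reference, n_min=1, n_max=6, beta=3.0):
    """Sketch of a character n-gram F-beta score in the range [0, 1]."""
    precisions, recalls = [], []
    for n in range(n_min, n_max + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        hyp_total = sum(hyp.values())
        ref_total = sum(ref.values())
        if hyp_total == 0 or ref_total == 0:
            continue  # assumption: skip orders longer than either string
        # Clipped matches: each n-gram counts at most as often as in the reference.
        matches = sum((hyp & ref).values())
        precisions.append(matches / hyp_total)
        recalls.append(matches / ref_total)
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)
```

Under these assumptions, calling `chrf(hyp, ref, beta=3.0)` gives a chrF3-style score, and `chrf(hyp, ref, n_min=3, n_max=7)` restricts the orders to the range mentioned in the last citation statement. The sketch does not reproduce the word-boundary splitting described there.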
