“…Another approach is to first train a translation system on the clean data, then use it to translate the non-English side into English and use monolingual matching methods to compare it against the English side of the parallel corpus. Different matching metrics were used: METEOR (Erdmann and Gwinnup, 2019), Levenshtein distance (Sen et al., 2019), or BLEU (Parcheta et al., 2019). Several submissions considered vocabulary coverage in their methods, preferring to add to the limited set those sentence pairs that increase the number of words and n-grams covered (Erdmann and Gwinnup, 2019; Bernier-Colborne and Lo, 2019; González-Rubio, 2019).…”
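
The two ideas in the passage above, scoring a machine-translated sentence against its English counterpart and preferring pairs that add vocabulary coverage, can be illustrated with a minimal sketch. It is not the method of any cited submission. It assumes the translation step has already been run, so each input is a translated string and an English string. It uses token-level Levenshtein distance as the matching metric and a simple greedy rule for coverage. All function names (`levenshtein`, `match_score`, `ngrams`, `select_by_coverage`), the whitespace tokenization, and the tie-breaking rule are hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple


def levenshtein(a: List[str], b: List[str]) -> int:
    """Token-level edit distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_score(translated: str, english: str) -> float:
    """Similarity in [0, 1]: one minus the length-normalized edit distance.

    `translated` is assumed to be the MT output for the non-English side.
    """
    hyp, ref = translated.lower().split(), english.lower().split()
    longest = max(len(hyp), len(ref))
    if longest == 0:
        return 0.0  # two empty sentences carry no evidence of a match
    return 1.0 - levenshtein(hyp, ref) / longest


def ngrams(tokens: List[str], max_n: int = 2) -> Set[Tuple[str, ...]]:
    """All n-grams of order 1..max_n in a token list."""
    return {tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}


def select_by_coverage(pairs: List[Tuple[str, float]],
                       budget: int,
                       max_n: int = 2) -> List[int]:
    """Greedily pick up to `budget` pairs by newly covered n-grams.

    `pairs` holds (english_sentence, match_score) tuples. At each step the
    pair adding the most unseen n-grams wins; ties go to the higher match
    score, then to the lower index (an illustrative rule, not from the text).
    """
    covered: Set[Tuple[str, ...]] = set()
    remaining = set(range(len(pairs)))
    chosen: List[int] = []
    while remaining and len(chosen) < budget:
        def gain(idx: int) -> Tuple[int, float, int]:
            sentence, score = pairs[idx]
            new = len(ngrams(sentence.lower().split(), max_n) - covered)
            return (new, score, -idx)

        best = max(remaining, key=gain)
        chosen.append(best)
        covered |= ngrams(pairs[best][0].lower().split(), max_n)
        remaining.remove(best)
    return chosen
```

In this sketch the match score and the coverage gain are kept separate, so a real pipeline could just as well threshold on the score first and then apply the greedy selection, or swap in METEOR or BLEU for the edit-distance similarity.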