“…Crosslingual word embeddings have been used to calculate the distance between equivalents in different languages (Luong et al., 2015; Artetxe et al., 2016). Defauw et al. (2019) treat filtering as a supervised regression problem and show that the Levenshtein distance (Levenshtein, 1966) between the target and the MT-translated source, as well as the cosine distance between sentence embeddings of the source and target, are important features. While they use InferSent (Conneau et al., 2017), BERT (Devlin et al., 2019) has recently been employed to calculate crosslingual semantic textual similarity for detecting misalignment, with good results (Lo and Simard, 2019).…”
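
As a rough illustration of the two features named in the passage, the following Python sketch computes a Levenshtein distance between a target sentence and an MT-translated source, and a cosine distance between two sentence-embedding vectors. It is only a sketch under stated assumptions: the function names (`levenshtein`, `cosine_distance`, `filtering_features`) are hypothetical, the edit distance is assumed to be character-level and unnormalized (the passage does not say which variant is used), and the embeddings are plain lists of floats standing in for vectors that would come from a sentence encoder such as InferSent.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute; each costs 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cosine_distance(u, v) -> float:
    """1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def filtering_features(target, mt_source, src_vec, tgt_vec):
    """Hypothetical feature dict for one sentence pair (assumed layout)."""
    return {
        "levenshtein": levenshtein(target, mt_source),
        "cosine_distance": cosine_distance(src_vec, tgt_vec),
    }
```

In a supervised setup like the one the passage describes, a dictionary of this kind would be one row of the feature matrix passed to a regressor; the choice of regressor and any normalization of the edit distance are left open here because the passage does not specify them.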