“…For example, multilingual BERT (Devlin et al., 2019) was trained on Wikipedia articles from more than 100 languages. Although reported performance improvements demonstrate that multilingual BERT can be used in monolingual (Hakala and Pyysalo, 2019), multilingual (Tsai et al., 2019), and cross-lingual settings (Wu and Dredze, 2019), it has been questioned whether multilingual BERT is truly multilingual (Pires et al., 2019; Singh et al., 2019; Libovický et al., 2019). Therefore, we will investigate the benefits of aligning its embeddings in our experiments.…”