“…Previous research has shown that many pre-trained models, such as GPT-2 (Radford et al., 2019), ELMo (Peters et al., 2018), BERT, and RoBERTa, have degenerated embedding spaces that degrade their semantic expressiveness (Ethayarajh, 2019; Cai et al., 2021; Rajaee and Pilehvar, 2021). Several proposals have been put forward to overcome this challenge (Gao et al., 2019; Zhang et al., 2020). However, to our knowledge, no study has so far been conducted on the degeneration problem in the multilingual embedding space.…”
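The degeneration the excerpt describes is often quantified as anisotropy: embeddings cluster in a narrow cone, so the average pairwise cosine similarity is far above zero. A minimal sketch of such a diagnostic (an illustrative toy, not the measure used in any of the cited papers; the function name and synthetic data are assumptions):

```python
import numpy as np

def mean_cosine_similarity(emb: np.ndarray) -> float:
    """Average pairwise cosine similarity of a set of embeddings.
    Near 0 suggests directions are spread out (isotropic);
    near 1 suggests a degenerate, cone-shaped space (anisotropic)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = emb.shape[0]
    # Exclude the self-similarity entries on the diagonal.
    return (sims.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)
# Isotropic toy embeddings: zero-mean Gaussian directions.
isotropic = rng.normal(size=(500, 64))
# Degenerate toy embeddings: a large shared offset dominates every vector.
degenerate = rng.normal(size=(500, 64)) * 0.1 + 5.0

print(mean_cosine_similarity(isotropic))   # close to 0
print(mean_cosine_similarity(degenerate))  # close to 1
```

The shared offset in the second set mimics the "common dominant direction" effect reported for pre-trained language models: raw similarities become uniformly high, which is what downgrades their semantic expressiveness.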