Code-switching (CS) in NLP has seen a rise of interest in recent years, including a dedicated workshop that began in 2014 (Diab et al., 2014) and is still ongoing (Solorio et al., 2021). CS in machine translation (MT) also has a long history (Le Féal, 1990; Climent et al., 2003; Sinha and Thakur, 2005; Johnson et al., 2017; Elmadany et al., 2021; Xu and Yvon, 2021), but interest has grown with the advent of large multilingual models such as mBART (Liu et al., 2020) and mT5 (Xue et al., 2020; Gautam et al., 2021; Jawahar et al., 2021). Due to the scarcity of available CS data and the ease of single-word translation, most of these recent MT works have synthetically created CS data for either training or testing by translating one or more of the words in a sentence (Song et al., 2019; Nakayama et al., 2019; Xu and Yvon, 2021).