“…For NLP, back translation (Sennrich et al., 2016) is one of the most successful data augmentation approaches: it translates target-language monolingual data into the source language to generate additional parallel data for MT model training. Other popular approaches include synonym replacement (Kobayashi, 2018), random deletion/swap/insertion (Kumar et al., 2020), and generation (Ding et al., 2020). Data augmentation has also proven useful in cross-lingual settings (Singh et al., 2020; Riabi et al., 2020; Qin et al., 2020), but most existing methods overlook better utilization of multilingual training data when such resources are available.…”
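As an illustration of the word-level operations mentioned above, here is a minimal sketch of random deletion, swap, and insertion over a tokenized sentence. This is not the cited authors' implementation; the function names, probabilities, and the use of in-sentence tokens for insertion (instead of thesaurus-based synonym insertion) are illustrative assumptions.

```python
import random

def random_deletion(tokens, p=0.1):
    # Drop each token independently with probability p;
    # keep at least one token so the example never becomes empty.
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]

def random_swap(tokens, n=1):
    # Swap n randomly chosen pairs of positions.
    tokens = tokens[:]
    for _ in range(n):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_insertion(tokens, n=1):
    # Insert n tokens drawn from the sentence itself at random positions
    # (a stand-in for synonym insertion, which requires a thesaurus or
    # a language model to propose replacements).
    tokens = tokens[:]
    for _ in range(n):
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(tokens))
    return tokens

sentence = "data augmentation improves low resource translation".split()
print(random_deletion(sentence))
print(random_swap(sentence))
print(random_insertion(sentence))
```

Each operation produces a perturbed copy of the input, so the training set can be enlarged cheaply without any external resources.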