“…Most pairs are from previous WMT (Gu, Kk, Tr, Ro, Et, Lt, Fi, Lv, Cs, Es, Zh, De, Ru, Fr ↔ En) and IWSLT (Vi, Ja, Ko, Nl, Ar, It ↔ En) competitions. We also use FLoRes pairs (En-Ne and En-Si), En-Hi from IITB (Kunchukuttan et al., 2017), and En-My from WAT19 (Ding et al., 2018, 2019). We divide the datasets into three categories: low resource (<1M sentence pairs), medium resource (>1M and <10M), and high resource (>10M).…”
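
The three-way split described in the excerpt can be sketched as a simple threshold function. This is only an illustrative sketch, not the authors' code: the function name `resource_category` is hypothetical, and because the excerpt does not say where a dataset of exactly 1M or exactly 10M sentence pairs belongs, the sketch assumes those boundary cases fall into the larger category.

```python
LOW_LIMIT = 1_000_000      # below this: low resource (from the excerpt)
HIGH_LIMIT = 10_000_000    # above this: high resource (from the excerpt)


def resource_category(num_sentence_pairs: int) -> str:
    """Bucket a parallel dataset by its number of sentence pairs.

    Assumption (not stated in the excerpt): a count of exactly 1M is
    treated as medium resource and a count of exactly 10M as high resource.
    """
    if num_sentence_pairs < 0:
        raise ValueError("number of sentence pairs must be non-negative")
    if num_sentence_pairs < LOW_LIMIT:
        return "low"
    if num_sentence_pairs < HIGH_LIMIT:
        return "medium"
    return "high"
```

Calling `resource_category(n)` with a dataset's sentence-pair count `n` returns one of `"low"`, `"medium"`, or `"high"`; the boundary handling can be changed by adjusting the two comparisons.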