“…OPUS contains more than 2.7 billion parallel sentences in 90 languages. The specific corpus we extracted consists of data from multiple domains and sources including: ParaCrawl project (Esplà-Gomis et al, 2019), EUbookshop (Skadiņš et al, 2014), Tilde Model (Rozis and Skadiņš, 2017), translation memories (DGT) (Steinberger et al, 2013), OpenSubtitles (Creutz, 2018), SciELO Parallel (Soares et al, 2018), JRC-Acquis Multilingual (Steinberger et al, 2006), Tanzil (Zarrabi-Zadeh, 2007), Europarl Parallel (Koehn, 2005), TED 2013 (Cettolo et al, 2012), Wikipedia (Wołk and Marasek, 2014), Tatoeba, QCRI Educational Domain (Abdelali et al, 2014), GNOME localization files, Global Voices, KDE4, Ubuntu, and Multilingual Bible (Christodouloupoulos and Steedman, 2015).…”