“…Different distributional semantics models have been developed to generate embeddings, and these have proved to capture the semantic properties of words adequately, as long as sufficiently large corpora are used (Joulin et al., 2016; Mikolov et al., 2013). Several application scenarios have been tested, including classification of Twitter streams (Khatua, Khatua & Cambria, 2019; Zhang & Luo, 2019), plagiarism detection (Tien et al., 2019), opinion mining on social networks (Nguyen & Le Nguyen, 2018; Rida-E-Fatima et al., 2019), recommendation systems (Chamberlain et al., 2020; Baek & Chung, 2021), mapping of scientific domain keywords (Hu et al., 2019), tracking emerging scientific keywords (Dridi et al., 2019), optimization of queries for Information Retrieval (Roy et al., 2019; Hofstätter et al., 2019), and sentiment analysis (Santosh Kumar, Yadav & Dhavale, 2021; Subba & Kumari, 2022). However, the great majority of such models have been developed for English corpora.…”