Proceedings of the 55th Annual Meeting of the Association For Computational Linguistics (Volume 2: Short Papers) 2017
DOI: 10.18653/v1/p17-2072

Methodical Evaluation of Arabic Word Embeddings

Abstract: Many unsupervised learning techniques have been proposed to obtain meaningful representations of words from text. In this study, we evaluate these various techniques when used to generate Arabic word embeddings. We first build a benchmark for the Arabic language that can be utilized to perform intrinsic evaluation of different word embeddings. We then perform additional extrinsic evaluations of the embeddings based on two NLP tasks.
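
To make the evaluation setup concrete, here is a minimal sketch of an analogy-based intrinsic evaluation using gensim; the vector file and benchmark file names are placeholders, not artifacts released with this paper:

```python
# Sketch of an analogy-based intrinsic evaluation with gensim.
# "arabic_vectors.bin" and "arabic_analogies.txt" are placeholder
# names, not artifacts released with this paper.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("arabic_vectors.bin", binary=True)

# The benchmark file uses the standard questions-words format:
# ": section-name" headers, then lines "a b c d" meaning a:b :: c:d.
score, sections = wv.evaluate_word_analogies("arabic_analogies.txt")
print(f"Overall analogy accuracy: {score:.3f}")
for sec in sections:
    total = len(sec["correct"]) + len(sec["incorrect"])
    if total:
        print(sec["section"], len(sec["correct"]) / total)
```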

Cited by 18 publications (13 citation statements) | References 5 publications

Citation statements, ordered by relevance:
“…These vectors capture semantic relations between words: words with similar meanings have vectors that lie close to each other. Building a word embedding model on a large-scale training dataset is important for obtaining meaningful embeddings [47]. We built a word vector model from our entire COVID-19 dataset, collected from January 2020 to April 2020.…”
Section: Misinformation Headline Tweet Examples
Citation type: mentioning (confidence: 99%)
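
A minimal sketch of the idea in this excerpt: train word2vec with gensim and check that related words receive nearby vectors. The tiny corpus below is a toy stand-in for the cited COVID-19 tweet dataset:

```python
# Toy sketch: train word2vec and check that related words get nearby
# vectors. The three "tweets" below are stand-ins for the cited
# COVID-19 corpus; real training needs a large dataset, as the
# excerpt notes.
from gensim.models import Word2Vec

tweets = [
    ["coronavirus", "vaccine", "trial", "results"],
    ["covid", "vaccine", "dose", "approved"],
    ["new", "covid", "cases", "reported", "coronavirus"],
]

model = Word2Vec(sentences=tweets, vector_size=100, window=5,
                 min_count=1, workers=1, epochs=20, seed=42)

# Cosine similarity between two word vectors.
print(model.wv.similarity("covid", "coronavirus"))
# Nearest neighbours by cosine similarity.
print(model.wv.most_similar("vaccine", topn=3))
```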
“…Word embedding tools, technologies, and pre-trained models are widely available for resource-rich languages such as English (Mikolov et al., 2013; Pennington et al., 2014) and Chinese (Li et al., 2018; Chen et al., 2015). Due to the wide use of word embeddings, pre-trained models are increasingly available for resource-poor languages as well, such as Portuguese (Hartmann et al., 2017), Arabic (Elrazzaz et al., 2017; Soliman et al., 2017), and Bengali (Ahmad and Amin, 2016).…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
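
Loading such a pre-trained model typically takes one call in gensim; the file name below is a placeholder (e.g. for an AraVec release), and the actual distribution format may differ:

```python
# Hypothetical loading of a pre-trained Arabic model such as AraVec
# (Soliman et al., 2017). The file name is a placeholder and the actual
# release format (full model vs. word2vec text/binary) may differ.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("aravec_twitter_cbow.bin",
                                       binary=True)
print(wv.most_similar("مصر", topn=5))  # nearest neighbours of "Egypt"
```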
“…Most of the proposed evaluation schemes are based on the word analogies introduced by Mikolov et al. (2013b) for English. For Arabic, Elrazzaz et al. (2017) created a benchmark that can be used to perform intrinsic evaluation of different word embeddings.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
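
The analogy scheme referenced here is usually implemented as 3CosAdd: for a:b :: c:?, return the vocabulary word whose vector is most cosine-similar to b - a + c, excluding the query words. A self-contained sketch with toy vectors:

```python
# Self-contained 3CosAdd sketch with toy 3-d vectors (values are
# illustrative only): answer a:b :: c:? by the word nearest to
# b - a + c, excluding the three query words.
import numpy as np

emb = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "man":   np.array([0.7, 0.1, 0.1]),
    "woman": np.array([0.1, 0.2, 0.8]),
    "queen": np.array([0.2, 0.7, 0.8]),
}

def cos(u, v):
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

a, b, c = "man", "king", "woman"          # man:king :: woman:?
target = emb[b] - emb[a] + emb[c]
best = max((w for w in emb if w not in (a, b, c)),
           key=lambda w: cos(target, emb[w]))
print(best)  # expected: "queen"
```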