2021
DOI: 10.3390/app11062567
Arabic Gloss WSD Using BERT

Abstract: Word Sense Disambiguation (WSD) aims to predict the correct sense of a word given its context. This problem is of extreme importance in Arabic, as written words can be highly ambiguous: 43% of diacritized words have multiple interpretations, and the percentage rises to 72% for non-diacritized words. Nevertheless, most written Arabic text lacks diacritical marks. Gloss-based WSD methods measure the semantic similarity or the overlap between the context of a target word that needs to be disambiguated …
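The gloss-overlap idea the abstract describes can be illustrated with a minimal Lesk-style sketch: pick the sense whose gloss shares the most content words with the target word's context. The toy glosses, the stopword list, and the whitespace tokenizer below are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal Lesk-style sketch of gloss-based WSD: choose the sense whose
# gloss overlaps most with the target word's context. Toy data only.

STOPWORDS = {"a", "an", "the", "of", "at", "to", "that", "she", "he"}

def tokenize(text):
    return set(text.lower().split()) - STOPWORDS

def lesk_disambiguate(context, senses):
    """senses: dict mapping sense id -> gloss string."""
    context_words = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense

# Toy English example standing in for an Arabic context-gloss lookup.
senses = {
    "bank.1": "a financial institution that accepts deposits of money",
    "bank.2": "the sloping land beside a body of water",
}
print(lesk_disambiguate("he deposited money at the bank", senses))  # bank.1
```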

Cited by 18 publications (9 citation statements) · References 20 publications (30 reference statements)

Citation statements, ordered by relevance:

“…In this paper, we investigate the use of different types of signals to emphasize target words in context for Arabic WSD. El-Razzaz et al. (2021) fine-tuned two BERT models on a small dataset of context-gloss pairs, consisting of about 5k lemmas and about 15k positive and 15k negative context-gloss pairs. They claimed an F1-score of 89%.…”
Section: Related Work (mentioning, confidence: 99%)
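The positive/negative context-gloss pairing described in this excerpt can be sketched as follows; the record layout, field names, and toy lexicon are assumptions for illustration, not El-Razzaz et al.'s actual data format.

```python
# Hypothetical sketch of building positive/negative context-gloss pairs
# for supervised WSD, in the spirit of the setup described above.

def build_pairs(context, target_lemma, correct_sense_id, lexicon):
    """Pair a context with every gloss of the target lemma.

    lexicon: dict mapping lemma -> dict of sense id -> gloss.
    Returns (context, gloss, label) triples: label 1 for the gloss of the
    correct sense, 0 for every other gloss of the same lemma.
    """
    pairs = []
    for sense_id, gloss in lexicon[target_lemma].items():
        label = 1 if sense_id == correct_sense_id else 0
        pairs.append((context, gloss, label))
    return pairs

lexicon = {"bank": {"bank.1": "a financial institution",
                    "bank.2": "land beside a river"}}
for p in build_pairs("he deposited money at the bank", "bank", "bank.1", lexicon):
    print(p)
```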
“…On the other hand, the FP (false positive) count represents the number of class 2 samples that have been misclassified as class 1, and vice versa for the FN (false negative) count. A specific part of the work by El-Razzaz explains these metrics briefly and uses them in other evaluations [29].…”
Section: Results (mentioning, confidence: 99%)
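The per-class counts this excerpt refers to map onto the standard precision/recall/F1 definitions. A minimal sketch, not tied to the exact evaluation code of [29]:

```python
# For a chosen positive class, tally TP/FP/FN from gold vs. predicted
# labels and combine them into precision, recall, and F1.

def prf1(gold, pred, positive_class):
    tp = sum(g == positive_class and p == positive_class for g, p in zip(gold, pred))
    fp = sum(g != positive_class and p == positive_class for g, p in zip(gold, pred))
    fn = sum(g == positive_class and p != positive_class for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [1, 1, 2, 2, 1]
pred = [1, 2, 2, 1, 1]
print(prf1(gold, pred, positive_class=1))  # (0.667, 0.667, 0.667)
```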
“…TSV has proven to be an effective solution for WSD in many state-of-the-art efforts. Although some researchers did not use the term TSV, the notion was also referred to as GlossBERT or Context-Gloss Binary Classification (Al-Hajj and Jarrar, 2022; El-Razzaz et al., 2021). A TSV training dataset is typically a set of context-gloss pairs, each labeled Positive or Negative.…”
Section: Related Work (mentioning, confidence: 99%)
“…It can be fine-tuned on domain- or task-specific data (e.g., POS tagging, WSD, TSV, and WiC) to update its contextualized embeddings. The TSV task has been addressed by fine-tuning BERT on context-gloss pairs as a sentence-pair binary classification problem (Huang et al., 2019; Yap et al., 2020; Patel et al., 2021; Ranjbar and Zeinali, 2021; Lin and Giambi, 2021; El-Razzaz et al., 2021; Al-Hajj and Jarrar, 2022). However, the TSV task, like most NLP tasks, suffers from the knowledge-acquisition bottleneck, i.e., the lack of quality datasets available to train machine learning models.…”
Section: Introduction (mentioning, confidence: 99%)
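The sentence-pair formulation this excerpt describes can be sketched with the standard Hugging Face transformers interface; the checkpoint name, the toy pair, and the single training step are illustrative assumptions, not the cited papers' exact setups.

```python
# Hedged sketch of context-gloss binary classification (TSV) with BERT,
# encoding (context, gloss) as one [CLS] context [SEP] gloss [SEP] input.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed; any BERT checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Sentence-pair encoding of a single toy context-gloss pair.
enc = tokenizer("he deposited money at the bank",
                "a financial institution that accepts deposits",
                return_tensors="pt", truncation=True)
label = torch.tensor([1])  # 1 = gloss matches the target word's sense

# One illustrative training step; real fine-tuning loops over a dataset
# with an optimizer and a held-out development set.
model.train()
out = model(**enc, labels=label)
out.loss.backward()
print(float(out.loss))
```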