Salam Khalifa scite author profile

Salam Khalifa

5 Publications

78 Citation Statements Received

104 Citation Statements Given

How they've been cited: 126

How they cite others: 112

Affiliations

New York University Abu Dhabi, University of Sharjah

Publications

Improving Arabic Diacritization through Syntactic Analysis

Shahrour, Khalifa, Habash (2015)

We present an approach to Arabic automatic diacritization that integrates syntactic analysis with morphological tagging through improving the prediction of case and state features. Our best system increases the accuracy of word diacritization by 2.5% absolute on all words, and 5.2% absolute on nominals over a state-of-the-art baseline. Similar increases are shown on the full morphological analysis choice.

A Morphological Analyzer for Gulf Arabic Verbs

Khalifa, Hassan, Habash (2017)

We present CALIMA GLF, a Gulf Arabic morphological analyzer currently covering over 2,600 verbal lemmas. We describe in detail the process of building the analyzer, starting from phonetic dictionary entries and ending with fully inflected orthographic paradigms and the associated lexicon and orthographic variants. We evaluate the coverage of CALIMA GLF against Modern Standard Arabic and Egyptian Arabic analyzers on part of a Gulf Arabic novel. CALIMA GLF verb analysis token recall for identifying the correct POS tag outperforms both the Modern Standard Arabic and Egyptian Arabic analyzers by over 27.4% and 16.9% absolute, respectively.

An Arabic Morphological Analyzer and Generator with Copious Features

Taji, Khalifa, Obeid et al. (2018)

We introduce CALIMA Star, a very rich Arabic morphological analyzer and generator that provides functional and form-based morphological features as well as built-in tokenization, phonological representation, lexical rationality and much more. This tool includes a fast engine that can be easily integrated into other systems, as well as an easy-to-use API and a web interface. CALIMA Star also supports morphological reinflection. We evaluate CALIMA Star against four commonly used analyzers for Arabic in terms of speed and morphological content.

SIGMORPHON–UniMorph 2022 Shared Task 0: Modeling Inflection in Language Acquisition

Kodner, Khalifa (2022)

This year's iteration of the SIGMORPHON-UniMorph shared task on "human-like" morphological inflection generation focuses on generalization and errors in language acquisition. Systems are trained on data sets extracted from corpora of child-directed speech in order to simulate a natural learning setting, and their predictions are evaluated against what is known about children's developmental trajectories for three well-studied patterns: English past tense, German noun plurals, and Arabic noun plurals. Three submitted neural systems were evaluated together with two baselines. Performance was generally good, and all systems were prone to human-like over-regularization. However, all systems were also prone to non-human-like over-irregularization and nonsense productions to varying degrees. We situate this behavior in a discussion of the Past Tense Debate.

Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects

Inoue, Khalifa, Habash (2022)

We present state-of-the-art results on morphosyntactic tagging across different varieties of Arabic using fine-tuned pre-trained transformer language models. Our models consistently outperform existing systems in Modern Standard Arabic and all the Arabic dialects we study, achieving 2.6% absolute improvement over the previous state-of-the-art in Modern Standard Arabic, 2.8% in Gulf, 1.6% in Egyptian, and 8.3% in Levantine. We explore different training setups for fine-tuning pre-trained transformer language models, including training data size, the use of external linguistic resources, and the use of annotated data from other dialects in a low-resource scenario. Our results show that strategic fine-tuning using datasets from other high-resource dialects is beneficial for a low-resource dialect. Additionally, we show that high-quality morphological analyzers as external linguistic resources are beneficial especially in low-resource settings.
