Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1173
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling

Abstract: Morphological tagging is challenging for morphologically rich languages due to the large target space and the amount of training data needed to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variation in the context of full morphological tagging. We use multitask learning for joint morpholo…
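The abstract mentions adversarial training across dialects. A common way to realize dialect-invariant shared representations is a gradient reversal layer (Ganin and Lempitsky, 2015); the sketch below is an illustrative PyTorch rendering under that assumption, not necessarily the paper's exact formulation, and names such as DialectAdversary are hypothetical.

```python
# Hypothetical sketch of dialect-adversarial training via gradient
# reversal. Illustrative only; not the authors' exact architecture.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient for x; no gradient for the scalar lambd.
        return -ctx.lambd * grad_output, None


class DialectAdversary(nn.Module):
    """Predicts the dialect label (e.g., MSA vs. EGY) from the shared
    representation. Because gradients are reversed, training this
    adversary pushes the shared encoder toward dialect-invariant
    features, which is the usual goal of adversarial multitask setups."""

    def __init__(self, hidden_dim: int, num_dialects: int, lambd: float = 1.0):
        super().__init__()
        self.lambd = lambd
        self.classifier = nn.Linear(hidden_dim, num_dialects)

    def forward(self, shared_repr: torch.Tensor) -> torch.Tensor:
        reversed_repr = GradReverse.apply(shared_repr, self.lambd)
        return self.classifier(reversed_repr)
```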

Cited by 16 publications (21 citation statements) | References 33 publications
“…The difference between their work and that of Zalmout and Habash (2017) is the use of a joint model to learn morphological features other than diacritics (i.e., features at the word level), rather than learning these features individually. Zalmout and Habash (2019a) obtained an additional boost in performance (a 0.3% improvement over ours) when they added a dialectal variant of Arabic to the learning process, sharing information between both languages. Alqahtani and Diab (2019a) provide comparable performance to ALL, and better performance on some task combinations, in terms of WER on all and OOV words.…”
Section: Input Representation (mentioning)
confidence: 65%
“…However, the model of Zalmout and Habash (2017) performs significantly better on OOV words. Zalmout and Habash (2019a) provide performance comparable to the ALL model. The difference between their work and that of Zalmout and Habash (2017) is the use of a joint model to learn morphological features other than diacritics (i.e., features at the word level), rather than learning these features individually.…”
Section: Input Representation (mentioning)
confidence: 89%
“…The tagging architecture is similar to the architecture presented by Zalmout and Habash (2019). We use two Bi-LSTM layers at the word level to model the context for each direction of the target word.…”
Section: Tagger (mentioning)
confidence: 99%
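As a concrete reading of the quoted description, the sketch below stacks two Bi-LSTM layers over a word-embedding sequence, assuming PyTorch; the class name WordContextEncoder and the dimensions are placeholders, not values taken from the paper.

```python
# Minimal sketch of a word-level context encoder with two stacked
# Bi-LSTM layers, assuming PyTorch. Dimensions are illustrative.
import torch
import torch.nn as nn


class WordContextEncoder(nn.Module):
    def __init__(self, emb_dim: int = 250, hidden_dim: int = 400):
        super().__init__()
        # num_layers=2 stacks two Bi-LSTMs, matching the quoted setup;
        # bidirectional=True covers both directions around each word.
        self.bilstm = nn.LSTM(
            input_size=emb_dim,
            hidden_size=hidden_dim,
            num_layers=2,
            bidirectional=True,
            batch_first=True,
        )

    def forward(self, word_embs: torch.Tensor) -> torch.Tensor:
        # word_embs: (batch, seq_len, emb_dim)
        # returns:   (batch, seq_len, 2 * hidden_dim), concatenating the
        # forward and backward context for each target word.
        context, _ = self.bilstm(word_embs)
        return context
```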
“…To obtain the a_j vector, for each morphological feature f we use a morphological analyzer to obtain all possible feature values of the word to be analyzed. We then embed each value separately (with separate embedding tensors for each feature, learned within the model), then sum the resulting vectors to get a_j^f (since these tags are alternatives and do not constitute a sequence) (Zalmout and Habash, 2019). We concatenate the individual a_j^f vectors for each morphological feature f of each word to get a single representation, a_j, for all the features:

a_j = [a_j^{f_1}; a_j^{f_2}; …; a_j^{f_n}]
…”
Section: Tagger (mentioning)
confidence: 99%
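The quoted construction maps directly to code: one embedding table per feature, a sum over the analyzer's candidate values, and a concatenation across features. This is a minimal PyTorch sketch; AnalyzerFeatureEncoder, the feature names, and the vocabulary sizes are illustrative assumptions.

```python
# Sketch of the a_j construction quoted above, assuming PyTorch.
import torch
import torch.nn as nn


class AnalyzerFeatureEncoder(nn.Module):
    def __init__(self, feature_vocab_sizes: dict, emb_dim: int = 50):
        super().__init__()
        # One embedding tensor per morphological feature, learned in-model.
        self.embeddings = nn.ModuleDict({
            feat: nn.Embedding(size, emb_dim)
            for feat, size in feature_vocab_sizes.items()
        })

    def forward(self, candidates: dict) -> torch.Tensor:
        # candidates[feat]: LongTensor of candidate value ids for one word.
        # Summing (rather than running a sequence model) reflects that the
        # analyzer's values are unordered alternatives, per the quote.
        per_feature = [
            self.embeddings[feat](ids).sum(dim=0)   # a_j^f
            for feat, ids in candidates.items()
        ]
        return torch.cat(per_feature, dim=-1)        # a_j


# Usage: suppose the analyzer proposes two POS values and one gender value.
enc = AnalyzerFeatureEncoder({"pos": 36, "gen": 4})
a_j = enc({"pos": torch.tensor([3, 17]), "gen": torch.tensor([1])})
print(a_j.shape)  # torch.Size([100]) = 2 features * emb_dim 50
```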
“…Later, a number of annotation efforts led to the creation of dialectal annotated corpora of varying sizes following the style of the PATB (Maamouri et al., 2014; Jarrar et al., 2016; Al-Shargi et al., 2016; Alshargi et al., 2019). The created annotations supported models for dialectal Arabic analysis, disambiguation, and tokenization, building on the same successful approaches used for MSA (Eskander et al., 2016a; Habash et al., 2013; Pasha et al., 2014; Zalmout and Habash, 2019). More closely related to this paper, Eldesouki et al. (2017) used a de-lexicalized analysis strategy for four colloquial varieties of Arabic, though they also use minimal training data and extract features from an open-class lexicon to learn either an SVM or a Bi-LSTM-CRF disambiguation model.…”
Section: Dialectal Arabic Models (mentioning)
confidence: 99%