2020
DOI: 10.1017/s1351324920000078

Effective multi-dialectal Arabic POS tagging

Abstract: This work introduces robust multi-dialectal part-of-speech tagging trained on an annotated data set of Arabic tweets in four major dialect groups: Egyptian, Levantine, Gulf, and Maghrebi. We implement two different sequence tagging approaches. The first uses conditional random fields (CRFs), while the second combines word- and character-based representations in a deep neural network with stacked layers of convolutional and recurrent networks with a CRF output layer. We successfully exploit a variety of feature…

citations
Cited by 12 publications
(11 citation statements)
references
References 22 publications
“…Very recently, Inoue et al (2022) have shown benefits from using pre-trained Transformer language models, especially when transferring from high- to low-resource dialects or language varieties, outperforming previous approaches. Darwish et al (2020) introduce a robust multidialect POS tagging system trained on tweets from four different dialect groups. They implement two approaches: the first uses CRFs, and the second stacks layers of CNNs, recurrent neural networks (RNNs), and a CRF layer.…”
Section: Arabic POS Tagging and Morphological Analysis (mentioning)
confidence: 99%
“…Apache OpenNLP is a Java open-source library that is utilized for Natural Language Processing (NLP), which uses the maximum entropy principle (Darwish et al, 2018). MaxEnt probability models provide a clear method for combining several pieces of contextual evidence and calculating the likelihood that a particular linguistic class will occur in a particular linguistic context with a contextual prediction as a function of:…”
Section: B Maximum Entropy Tagger (mentioning)
confidence: 99%
“…Also, Darwish et al. (2018) developed a new dataset composed of 350 POS-tagged Arabic tweets for each of the 4 major dialects: Egyptian, Levantine, Gulf and Maghrebi.…”
Section: NLP Resources for Arabic Dialects (mentioning)
confidence: 99%