2013
DOI: 10.1007/978-3-642-38824-8_51

Code Switch Point Detection in Arabic

Abstract: This paper introduces a dual-mode stochastic system to automatically identify linguistic code switch points in Arabic. The first of these modes determines the most likely word tag (i.e., dialect or modern standard Arabic) by choosing the sequence of Arabic word tags with maximum marginal probability via lattice search and 5-gram probability estimation. When words are out of vocabulary, the system switches to the second mode, which uses a dialectal Arabic (DA) and modern standard Arabic (MSA) morphological analyz…
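
As a rough illustration of the two modes the abstract describes, the sketch below tags each word as MSA or DA by searching a small tag lattice, then reports the positions where the tag changes. It is not the authors' system. Every name and number here is an assumption made for illustration: the probability tables are invented, the tokens are toy transliterations, a tag bigram stands in for the 5-gram model, plain Viterbi search (best joint path) stands in for the maximum-marginal search, and `oov_tags` is a hypothetical placeholder for the morphological-analyzer mode.

```python
import math

TAGS = ("MSA", "DA")

# Invented toy probabilities P(word | tag); a real system would estimate
# these from corpora.
WORD_GIVEN_TAG = {
    "MSA": {"hadha": 0.5, "kitab": 0.3, "jamil": 0.2},
    "DA": {"da": 0.5, "kitab": 0.2, "gamil": 0.3},
}
# Invented tag-bigram model P(tag_i | tag_{i-1}); a simplification of the
# 5-gram estimation mentioned in the abstract.
TAG_BIGRAM = {
    ("MSA", "MSA"): 0.8, ("MSA", "DA"): 0.2,
    ("DA", "DA"): 0.8, ("DA", "MSA"): 0.2,
}
FLOOR = 1e-6  # probability for a known word under a tag that never emits it


def in_vocabulary(word):
    return any(word in WORD_GIVEN_TAG[tag] for tag in TAGS)


def oov_tags(word):
    """Hypothetical stand-in for the morphological-analyzer mode."""
    # Toy rule: a final "sh" is treated as dialectal; otherwise undecided.
    return ("DA",) if word.endswith("sh") else TAGS


def candidates(word):
    """Return (tag, probability) pairs for one lattice column."""
    if in_vocabulary(word):
        return [(tag, WORD_GIVEN_TAG[tag].get(word, FLOOR)) for tag in TAGS]
    allowed = oov_tags(word)  # second mode: word is out of vocabulary
    return [(tag, 1.0 / len(allowed)) for tag in allowed]


def tag_sequence(words):
    """Viterbi search for the highest-scoring tag path through the lattice."""
    if not words:
        return []
    # best[tag] = (log score of the best path ending in tag, that path)
    best = {
        tag: (math.log(1.0 / len(TAGS)) + math.log(p), [tag])
        for tag, p in candidates(words[0])
    }
    for word in words[1:]:
        step = {}
        for tag, p in candidates(word):
            score, path = max(
                (prev_score + math.log(TAG_BIGRAM[(prev, tag)]), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[tag] = (score + math.log(p), path + [tag])
        best = step
    return max(best.values())[1]


def switch_points(tags):
    """Indices where the tag differs from the previous word's tag."""
    return [i for i in range(1, len(tags)) if tags[i] != tags[i - 1]]
```

With these toy tables, `tag_sequence(["hadha", "kitab", "gamil", "mafish"])` yields MSA, MSA, DA, DA, so `switch_points` reports a single switch at index 2. The last token is out of vocabulary and is resolved by the fallback rule rather than by the word probabilities.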

Citations: cited by 23 publications (21 citation statements)
References: 1 publication

“…For instance, Zbib et al (2012) show that small amounts of data from the right dialect can have a dramatic impact on the quality of Dialectal Arabic Machine Translation systems. Finally, we view the DSL task as a first step towards building a system that can identify code-switching in, for example, social media data, a task which has recently received increased attention from the NLP community (Elfardy et al, 2013).…”
Section: Introduction (mentioning)
confidence: 99%
“…Previous work on Arabic dialect identification uses n-gram based features at both word-level and character-level to identify dialectal sentences (Elfardy et al, 2013; Cotterell et al, 2014; Zaidan et al, 2014). created a dataset of dialectal Arabic.…”
Section: Previous Work (mentioning)
confidence: 99%
“…They performed cross-validation experiments for dialect identification using word n-gram based features. Elfardy et al (2013) built a system to distinguish between Egyptian and MSA. They used word n-gram features combined with core (token-based and perplexity-based features) and meta features for training.…”
Section: Previous Work (mentioning)
confidence: 99%
“…Several studies have emerged on word-level language identification (Nguyen and Dogruöz, 2013; Das and Gambäck, 2014; cf. Solorio et al, 2014), predicting code-switching points (Solorio and Liu, 2008a; Elfardy et al, 2013), and POS tagging (Solorio and Liu, 2008b; Vyas et al, 2014; Jamatia et al, 2015).…”
Section: Introduction (mentioning)
confidence: 99%