Proceedings of the 21st ACM International Conference on Information and Knowledge Management 2012
DOI: 10.1145/2396761.2398658

Language processing for arabic microblog retrieval

Citations: cited by 70 publications (62 citation statements)
References: 5 publications
“…Tweets and user locations were normalized and cleaned in the manner described in Darwish et al (2012) by mapping frequent non-Arabic characters and decoration to their mappings, handling repeated characters, etc. Below [is] an example that shows a tweet before and after normalization: Before: mbrwwwwwwk yA bA$A.…”
Section: Tweet Normalization
Type: mentioning
confidence: 99%
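
The "handling repeated characters" step mentioned in the statement above can be illustrated with a minimal sketch. This is not the procedure from Darwish et al (2012), whose details are not given here. The function name `collapse_repeats` and the `max_run` threshold are hypothetical choices made for illustration only.

```python
import re


def collapse_repeats(text: str, max_run: int = 1) -> str:
    """Shorten every run of one repeated character to at most max_run copies.

    Hypothetical helper: the threshold is an assumption, not a value taken
    from the cited paper.
    """
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    # (.) captures one character; \1{max_run,} matches max_run or more
    # further copies, so the whole match is a run longer than max_run.
    pattern = re.compile(r"(.)\1{%d,}" % max_run)
    return pattern.sub(lambda m: m.group(1) * max_run, text)


if __name__ == "__main__":
    # Transliterated tweet fragment quoted in the citation statement.
    print(collapse_repeats("mbrwwwwwwk yA bA$A."))
    print(collapse_repeats("mbrwwwwwwk yA bA$A.", max_run=2))
```

Collapsing to a single character (`max_run=1`) also shortens legitimately doubled letters, so a threshold of 2 may be the safer choice when the text contains genuine double letters.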
“…We plan to release the tweet ID's and our annotations. We preprocessed the training and test sets using the method described by Darwish et al (2012), which includes performing letter and word normalizations, and segmented all data using an open-source MSA word segmentor (Darwish et al, 2012). We also removed punctuations, hashtags, and name mentions from the test set.…”
Section: Evaluation Setup
Type: mentioning
confidence: 99%
“…-Basic normalization dataset is the normalized dataset with the basic Arabic normalization process to correct the most common Arabic misspellings. This is the same as the one that was used in [9], [26]:…”
Section: Sub-dataset
Type: mentioning
confidence: 99%