2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC) 2015
DOI: 10.1109/ucc.2015.80

Using Big Data Analytics for Authorship Authentication of Arabic Tweets

Cited by 16 publications (4 citation statements)
References 21 publications
“…In 2015, Albadarneh et al [7] performed the earlier work that handled the AA of Arabic tweets. First, they build a dataset that comprises 53,205 tweets posted by 20 different authors.…”
Section: Related Work (mentioning)
confidence: 99%
“…With more than 10.5 million tweets per day [5], Arabic is among the top five dominant languages on Twitter [6]. However, the first work dealing with AA of Arabic tweets is attributed to a publication dating back to 2015 [7]. In fact, during the following three years, barely ten published works that addressed Arabic AA [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…SA is viewed as a variation of the classical text categorization (TC) problem where the classes are simply the different polarities. This is not odd as many problems are approached as TC problems such as spam filtering (Aggarwal & Zhai, 2012), determining author's characteristics such as identity (Juola, 2006; Stamatatos, 2009; Alwajeeh et al, 2014; Albadarneh et al, 2015), demographic and psychometric traits (Estival et al, 2007), gender (Cheng, Chandramouli, & Subbalakshmi, 2011; Alsmearat et al, 2014, 2015), dialect (Zaidan & Callison-Burch, 2013), native language (Tetreault, Burstein, & Leacock, 2013), political orientation (Koppel, Akiva, Alshech, & Bar, 2009; Abooraig et al, 2014), etc. The literature proposes two main approaches for TC: corpus-based (Hmeidi et al, 2015a) and lexicon-based (Ahmed et al, 2015).…”
Section: Related Work (mentioning)
confidence: 99%
“…Given the problem of Big Data and streaming data, traditional approaches to training classifiers can be problematic. Accordingly, Seker, Al-Naami and Khan (2013) addressed the problem of Authorship Attribution in streaming data; the authors showed that the error rate tends to decrease as the size of the chunks presented to the classifier increases. Also on the subject of Big Data, Albadarneh et al (2015) bring this topic to the problem of authorship authentication in Arabic tweets, a language that has a range of dialects and scarce NLP resources. The BOW representation was used together with TF-IDF.…”
unclassified
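
The last statement above mentions a bag-of-words (BOW) representation combined with TF-IDF. As a rough illustration of what that weighting means, here is a minimal sketch. It is not the cited paper's implementation: the function name `tfidf_bow`, the whitespace tokenization, the unsmoothed `log(N / df)` form of IDF, and the placeholder documents are all assumptions made for this example. Real Arabic tweets would likely need proper normalization and tokenization first.

```python
import math
from collections import Counter


def tfidf_bow(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of text documents.

    Hypothetical helper for illustration only. Tokens are split on
    whitespace, which is an assumption, not the cited paper's method.
    """
    tokenized = [doc.split() for doc in docs]
    # Bag of words: a fixed, sorted vocabulary; word order is discarded.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty documents
        # TF = relative term count; IDF = log(N / df), unsmoothed.
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


# Placeholder "tweets" (made up for this example).
vocab, vectors = tfidf_bow(["a b a", "b c"])
```

In this sketch a term that occurs in every document (here `b`) gets weight zero, while terms specific to one document keep a positive weight. That is the property that makes TF-IDF useful for separating authors by vocabulary.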