Using Big Data Analytics for Authorship Authentication of Arabic Tweets

Albadarneh, Jafar; Talafha, Bashar; Al‐Ayyoub, Mahmoud; Zaqaibeh, Belal; Al-Smadi, Mohammad; Jararweh, Yaser; Benkhelifa, Elhadj

doi:10.1109/ucc.2015.80

Cited by 16 publications

(4 citation statements)

References 21 publications


“…In 2015, Albadarneh et al [7] performed the earlier work that handled the AA of Arabic tweets. First, they build a dataset that comprises 53,205 tweets posted by 20 different authors.…”

Section: Related Work (mentioning)

confidence: 99%


Arabic authorship attribution on Twitter: what is really matters?

Kah, Airej, Zeroual

2022

IJEECS


Recently, authorship attribution (AA) of online social network texts has gained more attention. However, since 2015, when the first work that addressed the AA of Arabic tweets was published, we found that not much has been done. Thus, the current paper presents an extensive study that investigates the effects of various factors on the AA of Arabic short texts, especially tweets. This led to a proposed architecture in which the AA accuracy is examined depending on the size of the training dataset, the number of classes covered, the text processing techniques applied, the methods used for both feature selection and extraction, and finally, the classifier implemented. As a result, we performed 792 different tests. The highest accuracy recorded is 97.4%, and it is among the best results published so far.

“…With more than 10.5 million tweets per day [5], Arabic is among the top five dominant languages on Twitter [6]. However, the first work dealing with AA of Arabic tweets is attributed to a publication dating back to 2015 [7]. In fact, during the following three years, barely ten published works that addressed Arabic AA [8].…”

Section: Introduction (mentioning)

confidence: 99%


“…SA is viewed as a variation of the classical text categorization (TC) problem where the classes are simply the different polarities. This is not odd as many problems are approached as TC problems such as spam filtering (Aggarwal & Zhai, 2012), determining author's characteristics such as identity (Juola, 2006; Stamatatos, 2009; Alwajeeh et al, 2014; Albadarneh et al, 2015), demographic and psychometric traits (Estival et al, 2007), gender (Cheng, Chandramouli, & Subbalakshmi, 2011; Alsmearat et al, 2014, 2015), dialect (Zaidan & Callison-Burch, 2013), native language (Tetreault, Burstein, & Leacock, 2013), political orientation (Koppel, Akiva, Alshech, & Bar, 2009; Abooraig et al, 2014), etc. The literature proposes two main approaches for TC: corpus-based (Hmeidi et al, 2015a) and lexicon-based (Ahmed et al, 2015).…”

Section: Related Work (mentioning)

confidence: 99%

Using Enhanced Lexicon-Based Approaches for the Determination of Aspect Categories and Their Polarities in Arabic Reviews

Jararweh, Al‐Ayyoub, Al-Smadi, et al.

2016

International Journal of Information Technology and Web Engineering

Self Cite


Sentiment Analysis (SA) is the process of determining whether the sentiment of a text written in a natural language is positive, negative or neutral. It is one of the most interesting subfields of natural language processing (NLP) and Web mining due to its diverse applications and the challenges associated with applying it to the massive amounts of textual data available online (especially on social networks). Most of the current work on SA focuses on the English language and works at the sentence level or the document level. This work focuses on the less studied version of SA, which is aspect-based SA (ABSA) for the Arabic language. Specifically, this work considers two ABSA tasks: aspect category determination and aspect category polarity determination, and makes use of the publicly available human annotated Arabic dataset (HAAD) along with the baseline experiments conducted by the HAAD providers. In this work, several lexicon-based approaches are presented for the two tasks at hand, and some of the presented approaches are shown to significantly outperform the best-known result on the given dataset. Enhancements of 9% and 46% were achieved in the aspect category determination and aspect category polarity determination tasks, respectively.


“…Given the problem of Big Data and streaming data, traditional approaches to training classifiers can be problematic. Accordingly, Seker, Al-Naami and Khan (2013) addressed the problem of authorship attribution in streaming data; the authors showed that the error rate tends to decrease as the size of the chunks presented to the classifier increases. Still on the subject of Big Data, Albadarneh et al (2015) bring this topic to the problem of authorship authentication of Arabic tweets, a language that has a number of dialects and scarce NLP resources. The BOW representation was used together with TF-IDF.…”

unclassified

Atribuição de autoria em dados temporais utilizando a rede social Reddit

Casimiro
