2020
DOI: 10.14569/ijacsa.2020.0111045

MSTD: Moroccan Sentiment Twitter Dataset

Abstract: With the proliferation of social media and Internet accessibility, a massive amount of data has been produced. In most cases, the textual data available through the web comes mainly from people expressing their views in informal words. The Arabic language is one of the hardest Semitic languages to deal with because of its complex morphology. In this paper, a new contribution to the Arabic resources is presented as a large Moroccan dataset retrieved from Twitter and carefully annotated by native speakers. For t…

Cited by 11 publications (12 citation statements)
References 32 publications
“…• Text classification (Abozinadah & Jones Jr, 2016; Abufayad, 2018; Ajlouni, 2021; AlBatayha, 2021; Habash, 2021; Mgheed, 2021). • Sentiment analysis (Al-Hagery et al, 2020; Alharbi et al, 2020; Al-Horaibi & Khan, 2016; Almutairi & Al-Hagery, 2021; Alotaibi et al, 2019; Kaibi et al, 2019, 2020; Khabour et al, 2022; Mihi, Ali, et al, 2020; Mihi, Ait, et al, 2020; Mihi et al, 2022; Oussous et al, 2020). • Language model (Alzu'bi & Duwairi, 2021; Hamed et al, 2017).…”
Section: Statement Of Need
Citation type: unclassified
“…This dataset consists of 12k tweets, which are labeled as Negative, Objective, Positive, or Sarcastic. To facilitate the analysis, two data subsets were created, one with sentiment labels and the other with a binary label for sarcasm (refer to Tables 7 and 8 for the content description).…”
Table fragment embedded in the statement above (column headers missing from the extract):
[32]: 223k tokens from Darija and MSA blog posts; No; 76k tokens
Voss et al [33]: corpus of tweets of Moroccan dialect written in Roman script; No; Unknown
Laoudi et al [34]: 1836 Hespress news website comments; No; 1.8k sequences
Maghfour et al [35]: 10k Facebook comments labeled for sentiment analysis; No; 3.5k sequences
MSTD [36]: 12k
Section: Sentiment Analysis and Sarcasm Automatic Detection
Citation type: mentioning
confidence: 99%
“…Moroccan sentiment Twitter dataset (MSTD) [39] is a Moroccan dataset retrieved from tweets covering four-way sentiment classification. We are interested in the binary dataset.…”
Section: Datasets
Citation type: mentioning
confidence: 99%