2020
DOI: 10.14569/ijacsa.2020.0111128

SDCT: Multi-Dialects Corpus Classification for Saudi Tweets

Abstract: There is an increasing demand for analyzing the contents of social media. However, sentiment analysis in the Arabic language, especially in Arabic dialects, can be very complex and challenging. This paper presents details of collecting and constructing a classified corpus of 4180 multi-dialectal Saudi tweets (SDCT). The tweets were annotated manually by five native speakers in two stages. The first stage annotated the tweets as Hijazi, Najdi, and Eastern based on some Saudi regions. The second stage anno…

Citations: cited by 3 publications (3 citation statements)
References: 16 publications
“…Within these varieties, abundant accents and linguistic peculiarities serve as distinguishable markers for each dialectal group. The Arabic spoken in Saudi Arabia, as a subset of the Arabian Peninsular variety, has been classified into five variants, namely, Najdi dialect (spoken in central Saudi Arabia), the northern dialects (northern Saudi Arabia), Hijazi dialect (western Saudi Arabia), the eastern dialects/Gulf Arabic (eastern Saudi Arabia), and the southern dialects (southern Saudi Arabia) ( Al-Twairesh et al., 2018 ; Bayazed et al., 2020 ).…”
Section: Introduction (mentioning)
confidence: 99%
“…Regional Varieties: Among the 18 dialects classes, the KSA class has the largest variety of dialects due to the geographical diversity and historical migration of people from different linguistic backgrounds. Thus, the East region of KSA tends to share a lot of linguistic similarities with Egypt, while the Southern region share similarities with Yemen, the Northern region is similar to the Levantine dialect (this includes: Syria, Jordan, Palestine, and Lebanon), and the Middle and Western regions congruent with rest of Gulf countries (Bayazed et al, 2020). Also, according to (Alruily, 2020), the majority of most active twitter users are from KSA.…”
Section: Error Analysis and Discussion (mentioning)
confidence: 99%
“…Similarly, general guidelines However, there is a need for a well-defined crowdsourcing platform that engages the deaf community, language experts, and technologists to build a multi-purpose sign language dictionary that can be used for building specialized translation tools for the deaf community. Similarly, different successful usages of crowdsourcing in relevant projects like the Bentham project [37], workforce-efficient consensus for bio-collections information [38], use of crowdsourcing in general [39], and application of crowdsourcing in corpus management in natural language processing [40,45] has also been presented in the literature.…”
Section: Use Of Crowdsourcing For Natural Language Processing Tasks (mentioning)
confidence: 99%