2020
DOI: 10.1109/access.2020.3028131

ProSOUL: A Framework to Identify Propaganda From Online Urdu Content

Abstract: Today, the rapid dissemination of information on digital platforms has seen the emergence of information pollution such as misinformation, disinformation, fake news, and different types of propaganda. Information pollution has become a serious threat to the online digital world and has posed several challenges to social media platforms and governments around the world. In this paper, we propose Propaganda Spotting in Online Urdu Language (ProSOUL), a framework to identify content and sources of propaganda sprea…

Cited by 26 publications (13 citation statements)
References 33 publications
“…We can look at the features from following points of view: Being imitated: Some features are difficult to be mimicked by malicious users (topographical features), while most of them can be imitated easily. Sensitive to time: For instance, [37] shows that Word n-gram features may provide information which is less relevant at that point of time. Required level of computational resources: Some features are readily available, while some others like layer ratio in [38] need processing…”
Section: Proaches (citation type: mentioning)
confidence: 99%
“…UrduWeb20 is effectively used to develop and test NLP and IR applications for the Urdu language. For instance, UrduWeb20 is employed by Kausar et al [61] for the propaganda detection from the Urdu content. Authors train machine learning models on the gold standard dataset of Urdu content.…”
Section: B. NLP/IR Applications (citation type: mentioning)
confidence: 99%
“…These dense vector representations have been leveraged extensively, for example, as input representations in neural network architectures for NLP tasks [10], e.g., detecting 'fake news' and phenomena related to the setting of this work [29]. In a recent study identifying online propaganda [18], Word2vec embeddings were found to outperform a multilingual version of BERT in Urdu [7], which the authors ascribe to the limited vocabulary of Urdu in the model. In another study, Word2vec has been leveraged as a feature in the detection of fake news where researchers found that it performs well in comparison to other textual features across multiple datasets and languages [9].…”
Section: Deriving Insights From Twitter Data (citation type: mentioning)
confidence: 99%