2023
DOI: 10.7717/peerj-cs.1248
How to detect propaganda from social media? Exploitation of semantic and fine-tuned language models

Abstract: Online propaganda is a mechanism for influencing the opinions of social media users, and a growing menace to public health, democratic institutions, and civil society. The present study proposes a propaganda detection framework as a binary classification model based on a news repository. Several feature models are explored to build a robust model, including part-of-speech, LIWC, word uni-gram, Embeddings from Language Models (ELMo), FastText, word2vec, latent semantic analysis (LSA), and char tri-gram feature m…


Cited by 12 publications (6 citation statements) | References 33 publications
“…The BERT model was introduced by Devlin et al (2018) at Google Lab and it has proven its significance for a variety of text-mining tasks in several application domains ( Malik, Imran & Mamdouh, 2023 ). The benefits of BERT include faster development, automated feature generation, reduced data requirements, and improved performance.…”
Section: Framework Methodology
confidence: 99%
“…TF-IDF is a statistical approach for evaluating the significance of a particular word within a larger collection of documents. The technique is commonly used in NLP and information retrieval (IR) tasks ( Malik, Imran & Mamdouh, 2023 ). It is a weighting scheme: the weight of a word in a document is proportional to its frequency of occurrence in that document and inversely proportional to its frequency across all documents.…”
Section: Framework Methodology
confidence: 99%
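The TF-IDF weighting described in the statement above can be sketched in a few lines of plain Python. This is an illustrative toy implementation with made-up tokens, not the cited paper's code; real pipelines typically use a library implementation such as scikit-learn's TfidfVectorizer.

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF weight: frequency of `term` in `doc`, scaled down
    by how many documents in `corpus` contain the term."""
    tf = doc.count(term) / len(doc)                 # term frequency in this document
    df = sum(1 for d in corpus if term in d)        # document frequency across the corpus
    idf = math.log(len(corpus) / df)                # assumes term occurs in at least one doc
    return tf * idf

# Toy corpus of tokenized "documents" (hypothetical example data)
corpus = [
    ["propaganda", "spreads", "online"],
    ["social", "media", "news"],
    ["online", "news", "media"],
]

# "propaganda" occurs in only one document, so it outweighs
# the more common "online" even though both appear once here.
rare_weight = tf_idf("propaganda", corpus[0], corpus)
common_weight = tf_idf("online", corpus[0], corpus)
```

A word appearing in every document gets idf = log(1) = 0, so ubiquitous words are discounted entirely, which is the behavior the quoted statement describes.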
“…First, traditional NLP models and commonly used LLMs in CSS research often lack reasoning capabilities [116]. For instance, LLMs like BERT-based models, which are extensively used in HCI research that analyzes large volumes of social media data [27,60], are typically fine-tuned for specific discrete downstream tasks (e.g., classification). While these pretrained language models have shown promise in performing discrete analyses, some emerging HCI research [116,117] demonstrates the additional value of prompting LLMs to perform multi-step reasoning for a more comprehensive analysis.…”
Section: Harnessing Large Language Models In Computational Social Science
confidence: 99%
“…The word2vec word embedding model has shown state-of-the-art performance in many classification tasks in the NLP domain [36][37][38]. Word2vec supports two methods for generating word embeddings: skip-gram and CBOW.…”
Section: Word2vec
confidence: 99%
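The difference between the two word2vec training objectives named above — skip-gram predicts the context from the target word, while CBOW predicts the target from its context — can be sketched by generating the training examples each method sees. This is a minimal illustration with an invented sentence, not the cited implementation; libraries like gensim handle this internally.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram training data: one (target, context_word) pair per
    context word inside the window around each target."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """CBOW training data: one (context_words, target) example per
    position; the whole window is used to predict the target."""
    examples = []
    for i, target in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        examples.append((context, target))
    return examples

# Hypothetical tokenized sentence
sent = ["detect", "propaganda", "on", "social", "media"]
```

For the word "propaganda" at position 1 with window 2, skip-gram emits the pairs ("propaganda", "detect"), ("propaganda", "on"), ("propaganda", "social"), whereas CBOW emits a single example whose input is the context list ["detect", "on", "social"].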