SpaML: a Bimodal Ensemble Learning Spam Detector based on NLP Techniques

Fattahi, Jaouhar; Mejri, Mohamed

doi:10.1109/csp51677.2021.9357595

Cited by 14 publications

(10 citation statements)

References 20 publications

(17 reference statements)

“…Ensemble learning: Ensembling different models has previously been found useful for text classification (Nozza et al., 2016; Kanakaraj and Guddeti, 2015; Fattahi and Mejri, 2021). Accordingly, ensembling was one of the most common strategies for improving on baseline PCL detection methods.…”

Section: Results (mentioning)

confidence: 99%

SemEval-2022 Task 4: Patronizing and Condescending Language Detection

Perez-Almendros, Espinosa-Anke, Schockaert

2022

Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper presents an overview of Task 4 at SemEval-2022, which was focused on detecting Patronizing and Condescending Language (PCL) towards vulnerable communities. Two sub-tasks were considered: a binary classification task, where participants needed to classify a given paragraph as containing PCL or not, and a multi-label classification task, where participants needed to identify which types of PCL are present (if any). The task attracted 77 teams. We provide an overview of how the task was organized, discuss the techniques that were employed by the different participants, and summarize the main resulting insights about PCL detection and categorization.

“…Nagwani and Sharaff proposed the use of ML algorithms such as Naïve Bayes (NB), support vector machine (SVM), non-negative matrix factorization, and latent Dirichlet allocation to identify spam [40], while Almeida et al. suggested text normalization [41]. Fattahi and Mejri applied natural language processing (NLP) techniques, namely Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF), to identify spam SMSs [42]. Choudhary and Jain applied random forest (RF) classification algorithms [43].…”

Section: Methods (mentioning)

confidence: 99%

Detecting Illegal Online Gambling (IOG) Services in the Mobile Environment

Min, Lee

2022

Security and Communication Networks

Despite the extensive ramifications of illegal online gambling (IOG) services, actions taken by government authorities have had little effect in halting these operations. In order to reduce the prevalence of IOG, the ability to detect malicious uniform resource locators (URLs) is crucial. Text mining and binary classification have been widely adopted to detect and prevent spam short message services (SMSs), but government authorities and various task forces that monitor and regulate gambling also rely on the analysis of malicious URLs. This study proposes a novel system to analyse the characteristics of spam URLs, offering a method that can assist government agencies combatting mobile IOG sites.

“…Fattahi & Mejri (2020) examined the Bag of Words (BoW) and TF-IDF spam detection algorithms using text data containing 747 spam message instances. They used a variety of machine learning approaches to classify spam and were able to achieve an accuracy of 97.99% and precision of 98.97%.…”

Section: Feature-extraction Techniques (mentioning)

confidence: 99%

A systematic literature review on spam content detection and classification

Kaddoura, Chandrasekaran, Popescu et al.

2022

PeerJ Computer Science

The presence of spam content in social media is increasing tremendously, and therefore the detection of spam has become vital. Spam content increases as people make extensive use of social media, e.g., Facebook, Twitter, YouTube, and e-mail. The time people spend on social media is growing rapidly, especially during the pandemic. Users receive many text messages through social media and cannot recognize the spam content in them. Spam messages contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. To improve social media security, the detection and control of spam text are essential. This paper presents a detailed survey of the latest developments in spam text detection and classification in social media. It discusses the techniques involved in spam detection and classification, including Machine Learning, Deep Learning, and text-based approaches. We also present the challenges encountered in identifying spam, along with its control mechanisms and the datasets used in existing work on spam detection.
