Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021) 2021
DOI: 10.18653/v1/2021.semeval-1.73

CS-UM6P at SemEval-2021 Task 1: A Deep Learning Model-based Pre-trained Transformer Encoder for Lexical Complexity

Abstract: Lexical Complexity Prediction (LCP) involves assigning a difficulty score to a particular word or expression in a text intended for a target audience. In this paper, we introduce a new deep learning-based system for this challenging task. The proposed system consists of a deep learning model, based on a pre-trained transformer encoder, for word and Multi-Word Expression (MWE) complexity prediction. First, on top of the encoder's contextualized word embedding, our model employs an attention layer on the input …

Cited by 5 publications (4 citation statements). References 13 publications (8 reference statements).
“…For sarcasm detection, several research studies have been introduced based on fine-tuning the existing PLMs for English and Arabic languages (Ghanem et al., 2019; Ghosh et al., 2020; Abu Farha et al., 2021). El Mahdaouy et al. (2021) have shown that incorporating attention layers on top of the contextualized word embedding of the PLM improves the performance of multi-task and single-task learning models for both sarcasm detection and sentiment analysis in Arabic. The main idea consists of classifying the input text based on the concatenation of the PLM's pooled output and the output of the attention layer.…”
Section: Related Work (mentioning)
confidence: 99%
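The architecture summarized in this statement can be sketched roughly as follows. This is a minimal PyTorch sketch assuming a Hugging Face-style encoder; the module and parameter names (`AttentionPooling`, `PLMWithAttention`, `model_name`, `num_labels`) are illustrative rather than taken from the cited papers, and the [CLS] embedding stands in for the PLM's pooled output.

```python
# Minimal sketch (assumption, not the authors' code): a pre-trained
# transformer encoder with an attention layer over its contextualized
# token embeddings; the classifier is fed the concatenation of the
# pooled ([CLS]) output and the attention output.
import torch
import torch.nn as nn
from transformers import AutoModel

class AttentionPooling(nn.Module):
    """Soft attention over contextualized token embeddings (illustrative name)."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
        scores = self.scorer(hidden_states).squeeze(-1)
        scores = scores.masked_fill(attention_mask == 0, torch.finfo(scores.dtype).min)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        return (weights * hidden_states).sum(dim=1)  # (batch, hidden)

class PLMWithAttention(nn.Module):
    """Single-task variant: encoder + attention layer + one classification head."""
    def __init__(self, model_name: str = "bert-base-uncased", num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.attention = AttentionPooling(hidden)
        # The head is fed the concatenation of the pooled output and the attention output.
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # [CLS] embedding as the pooled output
        attended = self.attention(out.last_hidden_state, attention_mask)
        return self.classifier(torch.cat([pooled, attended], dim=-1))
```

One way to read this design is that the classification head sees both a sentence-level summary (the pooled [CLS] vector) and a task-driven weighting of individual tokens learned by the attention layer.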
“…The main idea consists of classifying the input text based on the concatenation of the PLM's pooled output and the output of the attention layer. This architecture has yielded promising results on other tasks such as detecting and rating humor, lexical complexity prediction, and fine-grained Arabic dialect identification (Essefar et al., 2021; El Mamoun et al., 2021; El Mekki et al., 2021b).…”
Section: Related Work (mentioning)
confidence: 99%
“…El Mamoun et al. [17] introduced a new deep learning-based system for this challenging task. The proposed system consisted of a deep learning model based on a pre-trained transformer encoder for word and Multi-Word Expression (MWE) complexity prediction.…”
Section: Related Work (mentioning)
confidence: 99%
“…Each classification layer is fed with the concatenation of the task attention output and the [CLS] token embedding. This model has shown effective performance in many NLP tasks, including dialect identification, sentiment analysis, and sarcasm detection for the Arabic language [7,8], humor detection and rating, as well as lexical complexity prediction in English [17,18]. The single-task counterpart of MT_ATT is denoted by ST_ATT.…”
Section: Deep Learning Models (mentioning)
confidence: 99%
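A multi-task variant consistent with this description of MT_ATT might look like the sketch below: one attention layer and one classification head per task, each head fed the concatenation of that task's attention output and the [CLS] token embedding. The task names, label counts, and model identifier are placeholders, and the module layout is an assumption rather than the authors' released code.

```python
# Hypothetical multi-task sketch of MT_ATT: per-task attention layers and
# per-task classification heads, each fed the concatenation of the [CLS]
# embedding and that task's attention output. Names and label counts are
# placeholders, not the authors' configuration.
import torch
import torch.nn as nn
from transformers import AutoModel

class AttentionPooling(nn.Module):
    """Same illustrative attention pooling as in the earlier sketch."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states, attention_mask):
        scores = self.scorer(hidden_states).squeeze(-1)
        scores = scores.masked_fill(attention_mask == 0, torch.finfo(scores.dtype).min)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        return (weights * hidden_states).sum(dim=1)

class MultiTaskAttentionModel(nn.Module):
    def __init__(self, model_name: str, task_num_labels: dict):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # One attention layer and one classification head per task.
        self.task_attention = nn.ModuleDict(
            {task: AttentionPooling(hidden) for task in task_num_labels}
        )
        self.task_heads = nn.ModuleDict(
            {task: nn.Linear(2 * hidden, n) for task, n in task_num_labels.items()}
        )

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        cls_embedding = hidden[:, 0]  # shared [CLS] token embedding
        logits = {}
        for task, attn in self.task_attention.items():
            # Each head receives [CLS] concatenated with the task attention output.
            task_repr = torch.cat([cls_embedding, attn(hidden, attention_mask)], dim=-1)
            logits[task] = self.task_heads[task](task_repr)
        return logits

# Placeholder usage, e.g. sarcasm detection (2 classes) and sentiment (3 classes):
# model = MultiTaskAttentionModel("bert-base-multilingual-cased",
#                                 task_num_labels={"sarcasm": 2, "sentiment": 3})
```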