Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023) 2023
DOI: 10.18653/v1/2023.semeval-1.158

UniBoe’s at SemEval-2023 Task 10: Model-Agnostic Strategies for the Improvement of Hate-Tuned and Generative Models in the Classification of Sexist Posts

Abstract: We present our submission to SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS). We address all three tasks: Task A consists of identifying whether a post is sexist. If so, Task B attempts to assign it one of four classes: threats, derogation, animosity, and prejudiced discussions. Task C aims for an even more fine-grained classification, divided among 11 classes. We experiment with finetuning of hate-tuned Transformer-based models and priming for generative models. In addition, we explore mod…

Cited by 2 publications (2 citation statements)
References 25 publications

“…Generating data: With the problem of data scarcity, especially in multilingual settings, some research studies are directed into providing more efficient solutions for data augmentation, by leveraging generated samples in order to gradually train their detection models and enhance the performance of their classification capabilities. Several approaches have been already released on English samples, that may be used to work on generating multilingual data, using different methods like adversarial auto-regressive models (Ocampo, Cabrio & Villata, 2023), generative GPT3 PLM-based models (Hartvigsen et al, 2022), or generative GPT-Neo based model (Muti, Fernicola & Barrón-Cedeño, 2023).…”
Section: Challenges and Limitations (mentioning)
confidence: 99%
“…Curation of datasets [5,16,35] can also aid in sexism detection. With advancements made in Deep Learning (DL), especially after the introduction of transformer architecture [34], models like BERT [9] or RoBERTa [19] have become de-facto models that have been applied to detect sexism from text data [13,23,29]. Even though the aforementioned publications use the whole dataset to train and evaluate their models, some researchers [2-4, 12, 22] suggest that some data instances are more useful for driving the learning process and impacting the final model performance than others.…”
Section: Introduction (mentioning)
confidence: 99%