Anais Do XVI Encontro Nacional De Inteligência Artificial E Computacional (ENIAC 2019) 2019
DOI: 10.5753/eniac.2019.9354

Evaluation of word embedding techniques in the hate speech detection task (original title: Avaliação de técnicas de word embedding na tarefa de detecção de discurso de ódio)

Abstract: This article presents the results of exploring feature vectors generated by word embedding techniques (specifically word2vec and wang2vec) from a text corpus on the order of a billion tokens, compared with vectors generated from small corpora on the order of tens of thousands of tokens, applied to hate speech detection in Portuguese. Continuing the research carried out by other authors in Brazil and Portugal, and making use of the resources and suggestions by…

Citations: cited by 5 publications (3 citation statements)
References: 7 publications (13 reference statements)
“…When we started testing our model, we received the following inquiry from a policymaker who was interested in the topic: "What does the system consider to be hate speech?". Despite the performance results of our model (BERT-based model trained with the data set published by Fortuna et al (2019)) surpass the benchmarks presented by Pari (2019), our challenge was to also be able to interpret the model. We discussed previously that there is not an established definition for hate speech, so our concern was to investigate what our model understands as hate speech.…”
Section: Model Testing: Explainability
mentioning
confidence: 73%
“…In recent years, efforts have been invested in several AS tasks for Portuguese, with the aim of providing resources for the development of research and applications in this area. Among the tasks addressed, we can cite hate speech detection [Soto et al 2019, O. Plath et al 2022], irony and sarcasm detection [Schubert and de Freitas 2020, Gonçalves et al 2015], and the main one: polarity classification.…”
Section: Related Work
unclassified
“…This process is known as embedding. According to [10], the embedding process is the creation of a vector of real numbers that represents the words of a given text and that contains some knowledge of the positioning between the words.…”
Section: Algorithm Construction
unclassified
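
The idea of an embedding quoted above, words mapped to vectors of real numbers whose relative positions carry information, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the table `EMBEDDINGS`, its made-up values, and the helpers `cosine` and `embed_text` are hypothetical and are not taken from the cited papers. Averaging word vectors to represent a text is one common simplification, not necessarily the method any of these authors used.

```python
import math

# Hypothetical toy embedding table: each word maps to a vector of real numbers.
# The values are invented for illustration; in practice they would come from a
# trained model such as word2vec.
EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.7, 0.3, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embed_text(tokens, table=EMBEDDINGS):
    """Represent a text as the average of its known word vectors.

    Tokens missing from the table are skipped; if none are known,
    a zero vector of the table's dimensionality is returned.
    """
    dim = len(next(iter(table.values())))
    vectors = [table[t] for t in tokens if t in table]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

With these toy values, `cosine(EMBEDDINGS["good"], EMBEDDINGS["great"])` is larger than `cosine(EMBEDDINGS["good"], EMBEDDINGS["bad"])`, which is the sense in which positions in the vector space encode relationships between words. A text-level vector from `embed_text` could then be passed to any classifier.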