Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations 2021
DOI: 10.18653/v1/2021.emnlp-demo.9

KOAS: Korean Text Offensiveness Analysis System

Abstract: Warning: This manuscript contains a certain level of offensive expression. As communication through social media platforms has grown immensely, the increasing prevalence of offensive language online has become a critical problem. Notably in Korea, one of the countries with the highest Internet usage, automatic detection of offensive expressions has recently been brought to attention. However, the morphological richness and complex syntax of Korean cause difficulties in neural model training. Furthermore, most of p…

Cited by 16 publications (21 citation statements)
References 13 publications
“…Korean stop words were downloaded from three well-known sources and merged [1]. To find the nouns, the Open Korean Text part-of-speech tagger offered in the Korean NLP in Python (KoNLPy) module (Park and Cho, 2014) was used. These preprocessed nouns were then converted into single-word (unigram) and double-word (bigram) combinations for analysis.…”
Section: Methods (mentioning)
confidence: 99%
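The unigram and bigram step described in this citation statement can be sketched roughly as follows. This is a minimal illustration, not the cited authors' code: the noun list and stop-word set are hypothetical placeholders standing in for the tagger output and the merged stop-word sources, and the `ngrams` helper is an assumed name.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical input: nouns already extracted by a part-of-speech tagger.
nouns = ["뉴스", "댓글", "것", "혐오", "표현", "댓글"]
# Hypothetical merged stop-word set.
stop_words = {"것"}

# Drop stop words first, then count single- and double-word combinations.
filtered = [token for token in nouns if token not in stop_words]
unigram_counts = Counter(ngrams(filtered, 1))
bigram_counts = Counter(ngrams(filtered, 2))
```

Filtering before building bigrams means that words separated only by a stop word become adjacent; whether the cited study did this is not stated, so the ordering here is an assumption.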
“…It was demanding to manually investigate the plethora of text and comments of news articles, so natural language processing (NLP) procedures, including (1) tokenization, (2) stop words, and (3) stemming, were used in this study with the assistance of the Korean natural language processing in Python (KoNLPy) package [17, 18]. Korean natural language processing procedures were performed in a form that allows morphological analysis.…”
Section: Methods (mentioning)
confidence: 99%
“…Finally, use third-party word segmentation tools to perform word segmentation tasks. Our study chooses Jieba [18] for Chinese segmentation, konlpy for Korean [19] and Sudachipy for Japanese [20].…”
Section: B OOV Understanding (mentioning)
confidence: 99%