2021
DOI: 10.14569/ijacsa.2021.0121177

Normalisation of Indonesian-English Code-Mixed Text and its Effect on Emotion Classification

Abstract: Usage of code-mixed text has increased in recent years among Indonesian internet users, who often mix Indonesian-language with English-language text. Normalisation of this code-mixed text into Indonesian needs to be performed to capture the meaning of English parts of the text and process them effectively. We improve a state-of-the-art code-mixed Indonesian-English normalisation system by modifying its pipeline modules. We further analyse the effect of code-mixed normalisation on emotion classification tasks. …

Cited by 11 publications (12 citation statements)
References 24 publications

“…In general, DrQA pipeline combines bigram hashing with TF-IDF matching retriever and bidirectional RNN paragraph reader [13]. Some recent methods have utilized BERT for question answering, as a result of the effectiveness of BERT that has been shown in various text processing tasks in previous work [17], [18]. Alzubi et al [8] proposed COBERT, COVID-19 Question Answering System Using BERT.…”
Section: B Machine Reading Task For Question Answering (mentioning)
confidence: 99%
“…This research only focuses on Indonesian-language tweets. Therefore, it cannot be concluded that the methods used in this research are also effective for tweets in other languages or in code-mixed languages [72,73]. Further studies are needed to test the effectiveness of the methods in other languages.…”
Section: Discussion (mentioning)
confidence: 92%
“…Besides, exposure to English from social media and school makes Indonesians mix their languages with English (Rizal & Stymne, 2020). As a result, mixing Indonesian, Javanese, and English in daily conversation becomes the most prevalent language combination in Indonesian society (Yulianti et al, 2021).…”
Section: Introduction (mentioning)
confidence: 99%
“…LID is critical for some subsequent natural language processing tasks in code-mixed documents (Gundapu & Mamidi, 2018). Applying LID in the code-mixed text has become a foundation work of various NLP systems, including sentiment analysis (Ansari & Govilkar, 2018; Mahata, Das & Bandyopadhyay, 2021), translation (Barik, Mahendra & Adriani, 2019; Mahata et al, 2019), and emotion classification (Yulianti et al, 2021). The absence of LID in pre-processing tasks can affect those NLP systems.…”
Section: Introduction (mentioning)
confidence: 99%