Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1343

Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text

Abstract: Multilingual writers and speakers often alternate between two languages in a single discourse, a practice called "code-switching". Existing sentiment detection methods are usually trained on sentiment-labeled monolingual text. Manually labeled code-switched text, especially involving minority languages, is extremely rare. Consequently, the best monolingual methods perform relatively poorly on code-switched text. We present an effective technique for synthesizing labeled code-switched text from labeled monoling…

Cited by 7 publications (4 citation statements)
References 29 publications
“…To mitigate this issue, we can apply techniques from image forensics [88,89,90] to detect the authenticity of an image. Regarding guided instructions, for example, hate speech detection [91,92,93,94] can help to filter out malicious texts and prevent from producing controversial results with ethics concerns.…”
Section: Conclusion and Discussion (mentioning)
confidence: 99%
“…Synthetic data construction is a popular direction in NLP community, which effectively helps relieve the data annotation issues, such as data scarcity [57], label imbalance [6] and cross-lingual data [25].…”
Section: Synthetic Corpus Construction (mentioning)
confidence: 99%
“…Aguilar and Solorio (2020) augment morphological clues to language models and uses them for transfer learning from English to code-switched data with labels. Samanta et al (2019) uses translation API to create synthetic code-switched text from English datasets and use this for transfer learning from English to code-switched text without labels in the codeswitched case. Qin et al (2020) use synthetically generated code-switched data to enhance zero-shot cross-lingual transfer learning.…”
Section: Related Work (mentioning)
confidence: 99%