2020
DOI: 10.1007/978-3-030-63031-7_17

Low-Resource Text Classification via Cross-Lingual Language Model Fine-Tuning

Abstract: Text classification tends to be difficult when manually labeled text corpora are scarce. In low-resource agglutinative languages such as Uyghur, Kazakh, and Kyrgyz (UKK languages), words are formed by concatenating stems with several suffixes, and stems are used as the representation of text content; this morphology permits an effectively unbounded derivational vocabulary, which leads to high uncertainty in written forms and a large number of redundant features. There are major challenges of lo…
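As an illustration of the cross-lingual fine-tuning setup named in the title, the sketch below fine-tunes a pretrained multilingual encoder on labeled text from a better-resourced language and then applies it to a low-resource UKK language. This is a minimal sketch only: XLM-RoBERTa, the Hugging Face Transformers API, the label count, the example texts, and the hyperparameters are assumptions for illustration, not the configuration used in the paper.

# Minimal sketch (assumed setup): cross-lingual fine-tuning of a multilingual
# encoder for low-resource text classification.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=4  # hypothetical number of topic classes
)

# Hypothetical labeled data in a better-resourced related language (e.g. Turkish).
train_texts = ["Ekonomi haberleri ...", "Spor haberleri ..."]
train_labels = [0, 1]

enc = tokenizer(train_texts, truncation=True, padding=True, return_tensors="pt")
loader = DataLoader(
    TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(train_labels)),
    batch_size=2, shuffle=True,
)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    for input_ids, attention_mask, labels in loader:
        loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Zero-shot transfer: classify text in a low-resource language (Uyghur example).
model.eval()
with torch.no_grad():
    test = tokenizer(["ئۇيغۇرچە تېكىست ..."], return_tensors="pt", truncation=True)
    prediction = model(**test).logits.argmax(dim=-1)

Because all UKK languages share the multilingual encoder's subword vocabulary, the classifier fine-tuned on the better-resourced language can be evaluated or further fine-tuned on the low-resource language with no architectural changes.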

Citations: cited by 12 publications (5 citation statements)
References: 14 publications (17 reference statements)
“…In recent years, numerous researchers have turned their attention to contrastive learning [20][21][22][23][24][25], owing to its extraordinary performance in sentiment analysis [26][27][28]. Many models underpinned by contrastive learning have been introduced in natural language processing and computer vision.…”
Section: Contrastive Learning (citation type: mentioning)
confidence: 99%
“…We performed sequence tagging with different transformer models: (a) the uncased base implementations of BERT and mBERT (Devlin et al., 2018), (b) DistilmBERT (Sanh et al., 2019), trained using knowledge distillation, (c) XLM-RoBERTa (Conneau et al., 2019), and (d) Char-BERT (Boukkouri et al., 2020), which employs a character CNN to capture unknown and misspelled words. Motivated by prior work on multi-task learning (Chandu et al., 2018; Li et al., 2020), we also experiment with language-aware modeling. In these experiments, we added a language token either as part of the input encoding or as an output prediction.…”
Section: Datasets and Models (citation type: mentioning)
confidence: 99%
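The language-aware modeling described in the excerpt above (adding a language token to the input) can be sketched as follows. This is an assumed, minimal illustration rather than the cited authors' exact setup: the language tokens, tag inventory, example sentence, and base checkpoint are all hypothetical choices.

# Minimal sketch (assumed setup): language-aware sequence tagging with a
# per-example language token prepended to the input.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

LANG_TOKENS = ["[UYG]", "[KAZ]", "[KIR]"]  # assumed special tokens, one per language
TAGS = ["O", "B-ENT", "I-ENT"]             # hypothetical tag inventory

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-uncased")
tokenizer.add_special_tokens({"additional_special_tokens": LANG_TOKENS})

model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-uncased", num_labels=len(TAGS)
)
model.resize_token_embeddings(len(tokenizer))  # make room for the new language tokens

def encode(words, word_tags, lang_token):
    # Prepend the language token, then align word-level tags to subword tokens.
    enc = tokenizer([lang_token] + words, is_split_into_words=True,
                    truncation=True, return_tensors="pt")
    labels = []
    for word_id in enc.word_ids(batch_index=0):
        if word_id is None or word_id == 0:   # [CLS]/[SEP] and the language token
            labels.append(-100)               # ignored by the loss
        else:
            labels.append(TAGS.index(word_tags[word_id - 1]))
    return enc, torch.tensor([labels])

enc, labels = encode(["Almaty", "is", "a", "city"], ["B-ENT", "O", "O", "O"], "[KAZ]")
out = model(**enc, labels=labels)  # out.loss is the token-classification loss

The alternative mentioned in the excerpt, predicting the language as an output, would instead add a language-identification head trained jointly with the tagger rather than a token in the input.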