Proceedings of the 1st Workshop on Multilingual Representation Learning 2021
DOI: 10.18653/v1/2021.mrl-1.1
Language Models are Few-shot Multilingual Learners

Abstract: General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
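To make the in-context few-shot setup described in the abstract concrete, below is a minimal sketch of cross-lingual classification by prompting a frozen causal language model with a handful of English demonstrations and scoring candidate labels, with no parameter updates. The checkpoint name, prompt, and label set are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of in-context few-shot cross-lingual classification.
# Assumes the Hugging Face `transformers` library; "gpt2" is a stand-in
# checkpoint, not the exact model evaluated in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# A few English demonstrations as context, followed by a non-English query.
prompt = (
    "Review: The food was wonderful. Sentiment: positive\n"
    "Review: The service was terrible. Sentiment: negative\n"
    "Review: La comida estaba deliciosa. Sentiment:"
)
candidate_labels = [" positive", " negative"]

def label_log_likelihood(label: str) -> float:
    """Sum of log-probabilities the frozen model assigns to the label tokens."""
    prompt_len = tokenizer(prompt, return_tensors="pt")["input_ids"].shape[1]
    enc = tokenizer(prompt + label, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    # Shift so position i predicts token i+1, then keep only the label tokens.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = enc["input_ids"][:, 1:]
    token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp[:, prompt_len - 1 :].sum().item()

# Pick the label the model finds most likely as a continuation of the prompt.
prediction = max(candidate_labels, key=label_log_likelihood)
print(prediction.strip())  # e.g. "positive"
```

Scoring fixed label strings rather than parsing free-form generations keeps the multi-class setup well defined: each class corresponds to one candidate continuation, and the argmax over label likelihoods is the prediction.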

Cited by 692 publications (1,070 citation statements)
References 16 publications
“…For example, BERT [2] is a pre-trained transformer-based encoder model that can be fine-tuned on various NLP tasks, such as sentence classification, question answering, and named entity recognition. In fact, the so-called few-shot learning capability of large language models to be efficiently adapted to downstream tasks or even other seemingly unrelated tasks (e.g., as in transfer learning) has been empirically observed and studied for various natural-language tasks [6], e.g., more recently in the context of generating synthetic and yet realistic heterogeneous tabular data [7].…”
Section: Introduction (mentioning)
confidence: 99%
“…Deep learning (DL) models have attained excellent results across a diversity of problems, including large-scale image classification [1], natural language processing [2], and medical image segmentation [3]. However, standard approaches have been found to produce overly confident predictions, which means they are not correctly calibrated [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…There is a problem with reliability and interpretability [5]. Therefore, we should not use the outcomes of such language models in contexts where an incorrect output is ethically questionable. Since outputs can be subtly flawed or untrue, one has to remain generally skeptical.…”
Section: Introduction (mentioning)
confidence: 99%