Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.105

Probing Multilingual BERT for Genetic and Typological Signals

Abstract: We probe the layers in multilingual BERT (mBERT) for phylogenetic and geographic language signals across 100 languages and compute language distances based on the mBERT representations. We 1) employ the language distances to infer and evaluate language trees, finding that they are close to the reference family tree in terms of quartet tree distance, 2) perform distance matrix regression analysis, finding that the language distances can be best explained by phylogenetic and worst by structural factors, and 3) pr…
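
As a rough illustration of the pipeline the abstract describes, the sketch below extracts per-language representations from mBERT and computes pairwise language distances. It is not the authors' exact setup: the word lists, the probed layer, and the cosine distance metric are illustrative assumptions.

```python
# Minimal sketch (not the authors' exact pipeline): represent each language by
# the mean mBERT hidden state over a small word list, then take cosine
# distances between these language vectors. Word lists, the probed layer and
# the distance metric are illustrative assumptions.
import torch
from transformers import BertModel, BertTokenizer
from scipy.spatial.distance import cosine

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased",
                                  output_hidden_states=True)
model.eval()

# Hypothetical word lists; the paper uses a multilingual word list covering ~100 languages.
word_lists = {
    "eng": ["water", "fire", "mother"],
    "deu": ["Wasser", "Feuer", "Mutter"],
    "fra": ["eau", "feu", "mère"],
}

LAYER = 8  # one of mBERT's 12 layers; the paper probes every layer


def language_vector(words):
    """Mean hidden state at the chosen layer over all subword tokens of the word list."""
    vecs = []
    for word in words:
        enc = tokenizer(word, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc)
        subwords = out.hidden_states[LAYER][0, 1:-1]  # drop [CLS] and [SEP]
        vecs.append(subwords.mean(dim=0))
    return torch.stack(vecs).mean(dim=0).numpy()


lang_vecs = {lang: language_vector(words) for lang, words in word_lists.items()}
langs = sorted(lang_vecs)
distances = {(a, b): cosine(lang_vecs[a], lang_vecs[b])
             for i, a in enumerate(langs) for b in langs[i + 1:]}
print(distances)
```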

Cited by 9 publications (10 citation statements) | References 49 publications

Citation statements (ordered by relevance):
“…Though many languages share such features as a result of typological relations (which mBERT is known to exploit; see, e.g., Pires et al., 2019; Choenni and Shutova, 2020; Rama et al., 2020), there are also language-specific features to which, we hypothesise, mBERT needs to dedicate a greater share of its representational capacity, compared to the NLI task.…”
Section: Introduction
Confidence: 89%
“…Yu et al. (2021) train language embeddings from denoising autoencoders for 29 languages, which is still a small number. Rama et al. (2020) analyze language distance based on representations from mBERT and multilingual FastText embeddings (Bojanowski et al., 2017). They do so specifically by taking the averaged pairwise distances between vectors of words from a multilingual word list.…”
Section: Representational Similarity
Confidence: 99%
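
The averaged-distance computation described in the statement above can be sketched as follows. It assumes one vector per concept per language (standing in for mBERT or FastText embeddings) and averages cosine distances over matched concepts; the word list, the embedding dimensionality, and the UPGMA tree construction are illustrative assumptions, not the cited authors' code.

```python
# Minimal sketch of "averaged pairwise distances between vectors of words from
# a multilingual word list", assuming one vector per concept per language and
# averaging cosine distances over matched concepts. Vectors here are random
# placeholders standing in for mBERT or FastText embeddings.
import numpy as np
from scipy.spatial.distance import cosine
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
concepts = ["water", "fire", "mother", "stone"]
languages = ["eng", "deu", "fra", "spa"]
vectors = {lang: {c: rng.normal(size=300) for c in concepts} for lang in languages}


def language_distance(a, b):
    """Average per-concept cosine distance between two languages."""
    return float(np.mean([cosine(vectors[a][c], vectors[b][c]) for c in concepts]))


# Condensed distance matrix in the pair order scipy expects (i < j).
condensed = [language_distance(a, b)
             for i, a in enumerate(languages) for b in languages[i + 1:]]

# A UPGMA-style tree from the distance matrix; whether this matches the
# paper's tree-inference method is an assumption. The paper evaluates inferred
# trees against a reference family tree with quartet tree distance.
tree = linkage(condensed, method="average")
print(tree)
```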