Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP 2020
DOI: 10.18653/v1/2020.blackboxnlp-1.5
It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

Abstract: Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages. We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no finetuning. The results suggest that most of this information is encoded in a non-linear way, while some of it can also be recovered with purely linear tools. As part of our analysis, we test the hypothesis that mBE…
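As a concrete illustration of the kind of word-level translation the abstract describes, the sketch below retrieves translations by nearest-neighbour search over mBERT embeddings of isolated words. It is a minimal sketch of the general idea, not the paper's own procedure: the checkpoint name, the mean-pooling of subword vectors, and the toy English/Greek word lists are all assumptions made for illustration.

```python
# Minimal sketch: nearest-neighbour word translation in mBERT's embedding space.
# Illustrative only -- not the paper's exact method; word lists and pooling are assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(word):
    """Mean-pool the last-layer subword vectors of a word encoded in isolation."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]   # (seq_len, 768)
    return hidden[1:-1].mean(dim=0)                     # drop [CLS] and [SEP]

# Toy source (English) and target (Greek) vocabularies -- placeholders.
src_words = ["dog", "water", "house"]
tgt_words = ["σκύλος", "νερό", "σπίτι"]

src = torch.stack([embed(w) for w in src_words])
tgt = torch.stack([embed(w) for w in tgt_words])

# Cosine-similarity retrieval: each source word picks its nearest target word.
sims = torch.nn.functional.normalize(src, dim=-1) @ torch.nn.functional.normalize(tgt, dim=-1).T
for i, w in enumerate(src_words):
    print(w, "->", tgt_words[sims[i].argmax().item()])
```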

Cited by 21 publications (17 citation statements)
References 16 publications (22 reference statements)
“…Other work has taken this further by focusing on the hypothesis that mBERT encodings contain both a language-specific and a language-neutral component (Libovický et al., 2020). Gonen et al. (2020) set out to disentangle both components and find that in the 'language identity subspace', t-SNE projections show a large improvement in clustering with respect to language. In language-neutral space, semantic representations are largely intact.…”
Section: Related Work
confidence: 99%
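One simple way to operationalise the language-specific vs. language-neutral split described in the statement above is to treat each language's mean representation as the language-specific part and the residual as the language-neutral part, in the spirit of the centering idea discussed by Libovický et al. (2020). The sketch below is an assumption-laden illustration: the input arrays, their shapes, and the choice of mean-centering stand in for whatever representations a particular study actually uses.

```python
# Sketch: split representations into a language-specific centroid and a
# language-neutral residual by per-language mean-centering. Placeholder data.
import numpy as np

def split_components(reps_by_lang):
    """reps_by_lang: dict mapping language code -> array of shape (n_i, d)."""
    neutral, specific = {}, {}
    for lang, reps in reps_by_lang.items():
        mean = reps.mean(axis=0, keepdims=True)   # language centroid
        specific[lang] = mean                     # language-specific part
        neutral[lang] = reps - mean               # language-neutral residual
    return neutral, specific

# Toy usage with random vectors standing in for mBERT sentence representations.
rng = np.random.default_rng(0)
reps = {"en": rng.normal(size=(100, 768)), "el": rng.normal(size=(100, 768))}
neutral, specific = split_components(reps)
```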
“…These results have motivated researchers to try and disentangle the language-specific and language-neutral components of mBERT (e.g. Libovický et al., 2020; Gonen et al., 2020).…”
Section: Introduction
confidence: 99%
“…mBERT has been implemented for languages like Bangla, Greek, Danish, Turkish, etc. It contributes to multilingual text classification [25], [26], [33], offensive language detection [18], Word Sense Disambiguation [53], Translation Quality Estimation [16], [22], etc.…”
Section: Introduction
confidence: 99%
“…XLM-RoBERTa is a transformer model created by researchers at Facebook in 2019 [8]. It is superior to the earlier multilingual bidirectional encoder representations from transformers (mBERT) model, which is only a multilingual transformer model [9]. The mBERT model is in turn an evolution of the original BERT model, a transformer that computes attention over the words in a text and predicts masked words based on the attention each word receives from its context [10].…”
confidence: 99%
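The masked-word prediction described in the statement above can be tried directly with a fill-mask pipeline. The sketch below uses the publicly available xlm-roberta-base checkpoint; the example sentence and the top_k value are arbitrary choices for illustration, not drawn from the cited papers.

```python
# Sketch: masked-word prediction with a multilingual transformer via the
# Hugging Face fill-mask pipeline. XLM-RoBERTa uses <mask> as its mask token.
from transformers import pipeline

fill = pipeline("fill-mask", model="xlm-roberta-base")
print(fill("The capital of Greece is <mask>.", top_k=3))
```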