Proceedings of the 1st Workshop on Multilingual Representation Learning 2021
DOI: 10.18653/v1/2021.mrl-1.17

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations

Abstract: Recent research has adopted a new experimental field centered around the concept of text perturbations, which has revealed that shuffled word order has little to no impact on the downstream performance of Transformer-based language models across many NLP tasks. These findings contradict the common understanding of how the models encode hierarchical and structural information and even question whether word order is modeled with position embeddings. To this end, this paper proposes nine probing datasets organized …
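To make the perturbation setup concrete, the sketch below shows the simplest variant the abstract alludes to: shuffling the words of a sentence removes surface word order while keeping the vocabulary fixed, so any change (or lack of change) in a model's behavior can be attributed to order information. This is a minimal illustrative Python sketch rather than the paper's released code; the function name, the whitespace tokenization, and the fixed seed are assumptions made for the example.

import random

def shuffle_word_order(sentence, seed=None):
    # Randomly reorder the whitespace-separated tokens of a sentence,
    # preserving its vocabulary but destroying its original word order.
    tokens = sentence.split()
    rng = random.Random(seed)
    rng.shuffle(tokens)
    return " ".join(tokens)

# Example: produce a perturbed counterpart of an input sentence for probing.
print(shuffle_word_order("the quick brown fox jumps over the lazy dog", seed=13))

Comparing a model's behavior on original and perturbed sentence pairs is the general pattern such perturbation-based probing follows.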

Cited by 5 publications (3 citation statements)
References 54 publications (32 reference statements)
“…There have been several different approaches to quantifying the linguistic information that is learned by multilingual models. One direction has performed layer-wise analyses to quantify what information is stored at different layers in the model (de Vries et al., 2020; Taktasheva et al., 2021; Papadimitriou et al., 2021). Others have examined the extent to which the different training languages are captured by the model, finding that some languages suffer in the multilingual setting despite overall good performance from the models (Conneau et al., 2020a; …).”
Section: Related Work: Linguistic Knowledge in Multilingual Models (citation type: mentioning)
confidence: 99%
“…This work follows the same experimental direction, where text perturbations serve to explore the sensitivity of language models to specific phenomena (Futrell et al., 2019; Ettinger, 2020; Taktasheva et al., 2021; Dankers et al., 2021). It has been shown, for example, that shuffling word order causes significant performance drops on a wide range of QA tasks (Si et al., 2019; Sugawara et al., 2019), but that state-of-the-art NLU models are not sensitive to word order (Pham et al., 2020; …).”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Existing work on cross-lingual probing has shown that the grammatical knowledge of Transformer LMs adapts to the downstream language; in the case of Russian, the results are not easily interpreted (Ravishankar et al., 2019). However, LMs are less sensitive to granular perturbations when processing texts in languages with free word order, such as Russian (Taktasheva et al., 2021). …”
Section: Introduction (citation type: mentioning)
confidence: 99%