Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021)
DOI: 10.18653/v1/2021.eacl-main.295
Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?

Abstract: Although neural models have achieved impressive results on several NLP benchmarks, little is understood about the mechanisms they use to perform language tasks. Thus, much recent attention has been devoted to analyzing the sentence representations learned by neural encoders, through the lens of 'probing' tasks. However, to what extent is the information encoded in sentence representations, as discovered through a probe, actually used by the model to perform its task? In this work, we examine this probing paradigm…

Cited by 52 publications (53 citation statements)
References 47 publications
“…Ethayarajh, 2019; Mimno and Thompson, 2017), including the recently proposed DIRECTPROBE (Zhou and Srikumar, 2021), which we use in this work. Another line of probing work is to design control tasks (Ravichander et al., 2021; Lan et al., 2020) to reverse-engineer the internal mechanisms of representations (Kovaleva et al., 2019). However, in contrast to our work, most studies focused on the pre-trained representations, not the fine-tuned ones.…”
Section: Related Work (mentioning)
confidence: 82%
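
The probing setup quoted above trains a lightweight classifier on frozen representations; the control-task idea compares that probe against one trained on content-free labels. Below is a minimal Python sketch of this comparison under stated assumptions: random arrays stand in for encoder embeddings and property labels, a simple label permutation stands in for the word-type controls of the cited work, and the probe is a generic linear classifier rather than DIRECTPROBE itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Illustrative stand-ins: frozen sentence representations (e.g., from a
# pre-trained encoder) and a binary linguistic property to probe for.
n, d = 2000, 768
X = rng.normal(size=(n, d))        # placeholder embeddings
y = rng.integers(0, 2, size=n)     # placeholder property labels

# Control labels: same distribution, but any real signal is destroyed.
# (A permutation is a simplification of per-word-type random controls.)
y_control = rng.permutation(y)

def probe_accuracy(X, y):
    """Train a linear probe on the first half, score on the held-out half."""
    half = len(y) // 2
    clf = LogisticRegression(max_iter=1000).fit(X[:half], y[:half])
    return clf.score(X[half:], y[half:])

acc_task = probe_accuracy(X, y)
acc_control = probe_accuracy(X, y_control)

# Selectivity = task accuracy minus control accuracy: high probe accuracy
# with low selectivity suggests the probe, not the representation, is doing
# the work.
print(f"task {acc_task:.3f}  control {acc_control:.3f}  "
      f"selectivity {acc_task - acc_control:.3f}")
```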
“…Validity measures how well the test measures what it intends to measure. In a valid test, the result is right for the right reasons (McCoy et al., 2019; Ravichander et al., 2021). Robustness measures how well the results of a test can generalize from the experimental setting to real-world settings (Xing et al., 2020; Niu et al., 2020).…”
Section: Probing Methods (mentioning)
confidence: 99%
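
To make the validity notion above ("right for the right reasons") concrete, here is a hedged toy sketch: a classifier is trained on data where a spurious cue agrees with the label, then evaluated on a diagnostic split where the cue disagrees, in the spirit of McCoy et al.'s diagnostic sets. The data, dimensions, and the make_split helper are illustrative assumptions, not any cited dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_split(n, d, cue_agrees):
    """Toy data: feature 0 carries the true signal; feature 1 is a spurious
    cue that either agrees with the label (training-like) or contradicts it
    (diagnostic)."""
    X = rng.normal(size=(n, d))
    y = (X[:, 0] > 0).astype(int)
    cue = y if cue_agrees else 1 - y
    X[:, 1] = cue + 0.1 * rng.normal(size=n)
    return X, y

X_train, y_train = make_split(1000, 10, cue_agrees=True)
X_iid, y_iid = make_split(1000, 10, cue_agrees=True)
X_diag, y_diag = make_split(1000, 10, cue_agrees=False)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A valid test is passed for the right reasons: accuracy should hold up
# when the spurious cue no longer agrees with the label.
print("i.i.d. accuracy:    ", clf.score(X_iid, y_iid))
print("diagnostic accuracy:", clf.score(X_diag, y_diag))
```

A large gap between the two scores indicates the in-distribution result was obtained for the wrong reasons, which is exactly the failure mode a validity check is meant to expose.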
“…This has been a subject of recent criticism for probing methods, on the basis of "correlation does not equal causation": although probing methods infer that the model represents some concept, no guarantee is given on whether the model actually uses this concept to make its decisions [39,96,112]. This has led to the development of a causally-informed class of methods [32,38,115] that do provide a stronger guarantee that causality is correctly attributed, e.g., by showing that the model indeed changes its decision once it ceases to recognize the concept, via a derived counterfactual [28,32].…”
Section: Concept Attribution (mentioning)
confidence: 99%
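
The counterfactual-style check described above can be sketched with linear concept erasure: recover a concept direction with a probe, project it out of the representations, and test whether the downstream model's decisions change. The single null-space projection below is a one-step stand-in for iterative erasure methods such as INLP; the data and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Illustrative representations in which a linearly readable "concept" lives.
n, d = 1500, 50
Z = rng.normal(size=(n, d))
concept = (Z[:, 0] > 0).astype(int)   # the concept sits along dimension 0
task_label = concept                  # a downstream task that relies on it

task_clf = LogisticRegression(max_iter=1000).fit(Z, task_label)

# Step 1: probe for the concept and take the probe's weight direction.
probe = LogisticRegression(max_iter=1000).fit(Z, concept)
w = probe.coef_[0] / np.linalg.norm(probe.coef_[0])

# Step 2: counterfactual intervention by projecting the concept direction
# out of every representation (null-space projection).
Z_erased = Z - np.outer(Z @ w, w)

# Step 3: if the model actually uses the concept, its decisions should flip
# once the concept is no longer recoverable.
flip_rate = np.mean(task_clf.predict(Z) != task_clf.predict(Z_erased))
print(f"prediction flip rate after erasure: {flip_rate:.3f}")
```

A high flip rate supports a causal reading (the model used the concept), whereas unchanged predictions would suggest the probe detected information the model never relied on.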