Proceedings of the 11th Joint Conference on Lexical and Computational Semantics 2022
DOI: 10.18653/v1/2022.starsem-1.23

Pretraining on Interactions for Learning Grounded Affordance Representations

Abstract: Lexical semantics and cognitive science point to affordances (i.e. the actions that objects support) as critical for understanding and representing nouns and verbs. However, study of these semantic features has not yet been integrated with the "foundation" models that currently dominate language representation research. We hypothesize that predictive modeling of object state over time will result in representations that encode object affordance information "for free". We train a neural network to predict objec…

Cited by 8 publications (10 citation statements)
References 34 publications
“…Given the connection between facts learned in pretraining and the MLP layers (Geva et al., 2021; Meng et al., 2023; Merullo et al., 2023), it's possible that tuning attention alone is not enough to see higher performance in this setting.…”
Section: Results of Interventions on the World Capital Dataset
confidence: 99%
“…This naturally sets up our study, which also considers attention heads as the source of the competing effect between copying the counterfactual from earlier in context vs. extracting the memorized fact from an earlier subject token. A core technique in these works is projecting activations from model components into the vocabulary space to make claims about their roles, which we generically refer to here as logit attribution (Nostalgebraist, 2020; Wang et al., 2022; Merullo et al., 2023; Belrose et al., 2023; Dar et al., 2022; Millidge and Black, 2022). We leverage this technique to localize attention heads which tend to promote either context or memorized information (§6).…”
Section: Related Work
confidence: 99%
“…These were conceptual neurons in which the distinction between image and text tended to be overcome (Goh et al. 2021). Multimodality, at the neural level, is really panmodality, suggesting a semantics without clearly differentiated sign systems (this is also suggested by Merullo et al. 2022). Dumb meaning finds a new quality here, and is not tied to either text or image data, but encompasses both in a way that points to meaning beyond modal separation, and again has nothing to do with mind (see for more on this Bajohr 2024c).…”
confidence: 78%
“…The pragmatic methods in Section 4 are also compatible with LLMs, e.g., Liu et al. (2023) combine RSA with meta-learning to apply GPT models in an image reference game setting; FAIR et al. (2022) use a large BART model (Lewis et al., 2020) in conjunction with a multi-agent planning procedure in the grounded dialogue game of Diplomacy. As grounded LLM adapters (Alayrac et al., 2022; Merullo et al., 2023; Eichenberg et al., 2022; Koh et al., 2023) continue to improve, we expect to see more work applying LLMs as components of pragmatic models for these grounding tasks.…”
Section: Pragmatic Modeling and LLMs
confidence: 99%