2021 · Preprint
DOI: 10.48550/arxiv.2110.10819
Shaking the foundations: delusions in sequence models for interaction and control

Abstract: The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains. One important problem class that has remained relatively elusive, however, is purposeful adaptive behavior. Currently there is a common perception that sequence models "lack the understanding of the cause and effect of their actions", leading them to draw incorrect inferences due to auto-suggestive delusions. In this report we explain where this mismatch originates, and show that it can be resolved by treating actions as causal interventions. Finally, we show that in supervised learning, one can teach a system to condition or intervene on data by training with factual and counterfactual error signals respectively.
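The distinction between conditioning and intervening is easiest to see in a toy confounded setup. The sketch below is illustrative only: the two-task prior and expert policies are invented for this example and are not taken from the paper.

```python
import numpy as np

# Two latent tasks with equal prior; the expert's policy depends on the task.
prior = np.array([0.5, 0.5])          # P(tau)
policy = np.array([[0.9, 0.1],        # pi(a | tau=0)
                   [0.1, 0.9]])       # pi(a | tau=1)

rng = np.random.default_rng(0)

# A sequence model trained on expert data samples actions from the
# marginal P(a) = sum_tau P(tau) * pi(a | tau).
marginal = prior @ policy
a = rng.choice(2, p=marginal)

# Conditioning: treating the model's own sampled action as if it were
# expert evidence updates the belief over the task -- the delusion.
conditioned = prior * policy[:, a]
conditioned /= conditioned.sum()

# Intervening: under do(a) the action was generated by the model itself,
# so it carries no evidence about tau; the belief stays at the prior.
intervened = prior

print("sampled action:", a)
print("belief after conditioning:", conditioned)
print("belief after intervening: ", intervened)
```

Conditioning shifts the task belief to roughly (0.9, 0.1) after a single self-generated action, while the interventional belief correctly stays at (0.5, 0.5).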

Cited by 4 publications (3 citation statements)
References 14 publications (25 reference statements)
“…Generating actions using an autoregressive model can lead to causal "self-delusion" biases when there are confounding variables (Ortega et al., 2021). For example, sampling actions can condition the model to solve the wrong task when multiple tasks share similar observation and action specifications.…”
Section: Related Work
confidence: 99%
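Iterating this conditioning step shows how sampled actions can lock the model into one task. A minimal continuation of the toy setup sketched above (all quantities illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
prior = np.array([0.5, 0.5])          # P(tau) over two look-alike tasks
policy = np.array([[0.9, 0.1],        # pi(a | tau=0)
                   [0.1, 0.9]])       # pi(a | tau=1)

belief = prior.copy()
for t in range(10):
    # Sample an action from the model's own predictive distribution...
    a = rng.choice(2, p=belief @ policy)
    # ...then (wrongly) treat it as expert evidence about the task.
    belief = belief * policy[:, a]
    belief /= belief.sum()
    print(t, a, belief.round(3))
```

Within a few steps the belief collapses onto one task, selected essentially at random by the model's own early samples rather than by any external evidence.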
“…The Eliciting Latent Knowledge proposal (ELK, 2022) suggests making latent variables explicit, modelled using a Bayesian network, to improve interpretability and safety for advanced AI systems. Ortega et al. (2021) explain a formalism for LM finetuning with causal graphical models in order to extend the predictive capabilities of AI agents towards more adaptive behaviour. They focus on analysing an autoregressive action (random variable) prediction scheme in the interactive setting of RL, where a model is simultaneously a generator and predictor of data.…”
Section: Related Work
confidence: 99%
“…However, when applying this same model to the MPC process to optimize the control trajectory for specific results, we may not want to optimize future control states based on past executed controls. Recent theoretical insights [44] into learning models for control show that conditioning on prior actions can cause "self-delusions", i.e., the model takes its own actions as evidence about the world, thereby slowly corrupting the inference process. To avoid the occurrence of such self-delusions, we zero out all weights below the diagonal of the weight matrix w_{r|u} from (15), which represent the delusional effects of future controls on past actions.…”
Section: Model-Predictive Control With Interaction Primitives
confidence: 99%
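The masking they describe amounts to keeping only the diagonal and upper triangle of the coupling matrix. The sketch below guesses at shapes and naming (w_r_u and its 5×5 size are hypothetical, not taken from the cited paper's equation (15)):

```python
import numpy as np

# Hypothetical coupling matrix w_{r|u}: entry [i, j] weights the influence
# of the control at step i on the response at step j.  Entries below the
# diagonal (i > j) would let future controls explain past actions -- the
# delusional direction.
rng = np.random.default_rng(0)
w_r_u = rng.standard_normal((5, 5))

# Zero everything below the diagonal; controls may only influence
# same-step or later responses.
w_r_u = np.triu(w_r_u)
```

np.triu zeros all entries below the diagonal in a single call, which matches the "zero out all weights below the diagonal" step under the indexing convention assumed above.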