2022 · Preprint
DOI: 10.48550/arxiv.2207.10342

Language Model Cascades

Abstract: Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow implementing disparate…
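
To make the graphical-model framing of the abstract concrete, here is a minimal sketch (my illustration, not code from the paper) of a two-stage cascade written as an ordinary program: each string-valued random variable is drawn by sampling a completion from a prompted LM, and running the program top to bottom is ancestral sampling of the implied graphical model. The helper llm_sample is a hypothetical placeholder for any LM sampling API.

def llm_sample(prompt: str) -> str:
    """Hypothetical placeholder: draw one completion from a prompted LM."""
    raise NotImplementedError("plug in an actual language model API here")

def cascade(question: str) -> dict:
    # First string-valued random variable: a reasoning trace ("scratchpad").
    thought = llm_sample(f"Q: {question}\nLet's think step by step:")
    # Second string-valued random variable, conditioned on the first.
    answer = llm_sample(f"Q: {question}\nReasoning: {thought}\nA:")
    # Running the program forwards is ancestral sampling of the
    # graphical model thought -> answer.
    return {"thought": thought, "answer": answer}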

Cited by 6 publications (8 citation statements) · References 6 publications

Citation statements (ordered by relevance):
“…The rationale is not a separate output that may or may not be consistent with the answer, it is the only part of the passage available to the answer-generation step. This is an example of general architecture that has been referred to as a language model cascade (Dohan et al, 2022), a framework that generalizes earlier work on prompt chaining and multi-stage prompting (Liu et al, 2022). Our work shows that such cascades can indeed lead to reliable and useful rationales.…”
Section: Related Work (supporting)
confidence: 57%
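
A rough sketch of the rationale-bottleneck cascade this statement describes (my illustration under assumed prompt wording, not the cited paper's code): the answer-generation step receives only the generated rationale, never the original passage, so the rationale cannot be an after-the-fact justification.

def generate_rationale(passage: str, question: str, lm) -> str:
    # Stage 1: the LM reads the passage and extracts the evidence it
    # considers necessary to answer the question.
    return lm(f"Passage: {passage}\nQuestion: {question}\n"
              "Evidence needed to answer:")

def answer_from_rationale(rationale: str, question: str, lm) -> str:
    # Stage 2: the passage is deliberately withheld; the rationale is
    # the only context available when producing the answer.
    return lm(f"Evidence: {rationale}\nQuestion: {question}\nAnswer:")

def rationale_cascade(passage: str, question: str, lm) -> tuple[str, str]:
    rationale = generate_rationale(passage, question, lm)
    return rationale, answer_from_rationale(rationale, question, lm)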
“…To our knowledge, the idea of integrating language models as primitives into a probabilistic programming system was first proposed by Lew et al [2020], who showed that in certain verbal reasoning tasks, the posteriors of such programs were better models of human behavior than unconstrained language models. More recently, Dohan et al [2022] proposed unifying various approaches to "chaining" LLMs by understanding them as graphical models or probabilistic programs with string-valued random variables. But in the "chain-of-thought"-style applications they explore, there are typically no unknown variables with non-trivial likelihood terms, so no inference algorithm is required; "forward" or "ancestral" sampling suffices.…”
Section: Related Work and Discussion (mentioning)
confidence: 99%
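
To illustrate the distinction this statement draws (a sketch of mine, not code from either cited paper): with nothing observed, sampling the cascade is just running it forwards, whereas conditioning on an observed answer turns it into a posterior-inference problem, for which rejection sampling is the simplest, if usually impractical, algorithm.

def llm_sample(prompt: str) -> str:
    """Hypothetical placeholder LM sampler."""
    raise NotImplementedError("plug in an actual language model API here")

def ancestral_sample(question: str) -> dict:
    # No observed variables and no likelihood terms: forward sampling of
    # the program is already exact sampling from the joint distribution.
    thought = llm_sample(f"{question}\nThought:")
    answer = llm_sample(f"{question}\nThought: {thought}\nAnswer:")
    return {"thought": thought, "answer": answer}

def posterior_thought(question: str, observed_answer: str,
                      max_tries: int = 1000) -> str:
    # Conditioning on the answer requires inference; rejection sampling
    # keeps forward samples whose answer matches the observation.
    for _ in range(max_tries):
        draw = ancestral_sample(question)
        if draw["answer"].strip() == observed_answer:
            return draw["thought"]
    raise RuntimeError("no sample accepted; a better inference algorithm is needed")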
“…to enable conversational interfaces without re-evaluating the entire conversation history with each new message. But we have found that extending it to the multi-particle setting makes inference in language model probabilistic programs significantly cheaper, compared to previous approaches to integrating Transformer models into probabilistic programs [Lew et al, 2020, Dohan et al, 2022, Zhi-Xuan, 2022].…”
mentioning
confidence: 82%
“…More recent approaches have proposed probabilistic inference approaches for tackling true/false question answering and commonsense question answering (Jung et al, 2022; Liu et al, 2022a). Xie et al (2021) presents a Bayesian inference perspective on in-context learning, and Dohan et al (2022) formalizes and unifies existing prompting techniques in a probabilistic framework. Our work generalizes such approaches to perform arbitrary probabilistic inference outside of the LLM.…”
Section: Related Work (mentioning)
confidence: 99%