2023
DOI: 10.48550/arxiv.2303.13112
Preprint

A Simple Explanation for the Phase Transition in Large Language Models with List Decoding

Abstract: Various recent experimental results show that large language models (LLMs) exhibit emergent abilities that are not present in small models: system performance improves greatly after passing a certain critical threshold of scale. In this letter, we provide a simple explanation for this phase transition phenomenon. We model an LLM as a sequence-to-sequence random function. Instead of using instant generation at each step, we use a list decoder that keeps a list of candidate sequences at each step a…
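The list decoder mentioned in the abstract can be illustrated with a minimal sketch: rather than committing to a single token at each step ("instant generation"), the decoder expands every candidate sequence by every token and keeps only the top-scoring list. The vocabulary, scoring function, and parameter names below are hypothetical toy choices, not the paper's actual model.

```python
import heapq

def toy_log_prob(seq, token):
    """Hypothetical scoring function: prefers repeating the last token."""
    if seq and seq[-1] == token:
        return -0.1
    return -1.0

def list_decode(vocab, steps, list_size):
    """Keep the `list_size` highest-scoring candidate sequences at each step."""
    candidates = [(0.0, ())]  # (cumulative log-prob, sequence)
    for _ in range(steps):
        expanded = [
            (score + toy_log_prob(seq, tok), seq + (tok,))
            for score, seq in candidates
            for tok in vocab
        ]
        # prune back down to the top `list_size` candidates
        candidates = heapq.nlargest(list_size, expanded)
    return candidates

best = list_decode(vocab=["a", "b"], steps=3, list_size=2)
```

With this toy scorer the surviving candidates are constant sequences, since repetition is rewarded; a larger `list_size` keeps more low-probability hypotheses alive, which is the mechanism the paper analyzes.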

Cited by 1 publication (1 citation statement)
References 10 publications
“…Theoretical analyses of phase transitions in language models: Some theoretical studies have examined phase transitions in mathematical models to explain phenomena such as emergent abilities [Chang, 2023] and grokking [Žunkovič and Ilievski, 2022; Rubin et al., 2024], although some of them are not necessarily limited to LLMs. Cui et al. [2024] demonstrated that a simple model with a self-attention mechanism exhibits a phase transition between phases corresponding to positional and semantic mechanisms.…”
Section: Related Work
confidence: 99%