2023
DOI: 10.48550/arxiv.2301.05062
Preprint
Tracr: Compiled Transformers as a Laboratory for Interpretability

Abstract: Interpretability research aims to build tools for understanding machine learning (ML) models. However, such tools are inherently hard to evaluate because we do not have ground truth information about how ML models actually work. In this work, we propose to build transformer models manually as a testbed for interpretability research. We introduce Tracr, a "compiler" for translating human-readable programs into weights of a transformer model. Tracr takes code written in RASP, a…
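To make the compilation pipeline concrete, here is a minimal sketch based on the sequence-reversal example documented in the Tracr repository (github.com/google-deepmind/tracr). The module paths (`tracr.rasp.rasp`, `tracr.compiler.compiling`), the `compile_rasp_to_model` signature, and the `compiler_bos` argument follow that documentation and should be treated as version-dependent assumptions rather than a guaranteed API.

```python
# Sketch: compile a RASP program into transformer weights with Tracr,
# following the sequence-reversal example from the Tracr repository.
from tracr.rasp import rasp
from tracr.compiler import compiling

# length: attend from every position to every position (TRUE predicate)
# and count how many positions are selected.
all_true = rasp.Select(rasp.tokens, rasp.tokens, rasp.Comparison.TRUE)
length = rasp.SelectorWidth(all_true)

# For each position i, compute the opposite index (length - i - 1),
# then attend to that position and copy its token: sequence reversal.
opp_index = length - rasp.indices - 1
flip = rasp.Select(rasp.indices, opp_index, rasp.Comparison.EQ)
reverse = rasp.Aggregate(flip, rasp.tokens)

# Compile the RASP program into concrete transformer weights.
model = compiling.compile_rasp_to_model(
    reverse,
    vocab={1, 2, 3},
    max_seq_len=5,
    compiler_bos="BOS",
)

out = model.apply(["BOS", 1, 2, 3])
print(out.decoded)  # expected: ["BOS", 3, 2, 1]
```

Because the weights are constructed rather than learned, the circuit implementing each RASP operation is known exactly, which is what lets the compiled model serve as ground truth for evaluating interpretability tools.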

Cited by 2 publications (3 citation statements) | References 11 publications
“…In a recent and related work, Lindner et al. [2023] suggest using transformer networks as programmable units and introduce a compiler called Tracr which utilizes RASP. However, the expressivity limitations and unclear Turing completeness of the language are discussed in Weiss et al. [2021], Merrill et al. [2022], and Lindner et al. [2023]. Our approach, in contrast, demonstrates the potential of transformer networks to serve as universal computers, enabling the implementation of arbitrary nonlinear functions and emulating iterative, non-linear algorithms.…”
Section: Prior Work
confidence: 99%
“…Interpretation methods have been actively developing recently due to the various real-world applications of neural networks and the need to debug and maintain systems based on them. In particular, the Transformer architecture (Vaswani et al., 2017) demonstrates state-of-the-art performance in natural language processing and other modalities, representing a fertile field for the development of interpretability methods (Elhage et al., 2021; Weiss et al., 2021; Zhou et al., 2023; Lindner et al., 2023). Based on active research on the computational model behind the transformer architecture, recent works propose a way to learn models that are fully interpretable by design (Friedman et al., 2023).…”
Section: Introduction
confidence: 99%