2019
DOI: 10.1145/3371106

A simple differentiable programming language

Abstract: Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear; discrepancies can arise, sometimes inadvertently. In order to study automatic differentiation in such programming contexts, we define a small but expressive programming language that includes a construct for reverse-mode differentiation. […]


Cited by 52 publications (101 citation statements). References 31 publications (27 reference statements).
“…After the advances in deep learning of the last years, this is no longer the case: neural network architectures are now “dynamic”, in the sense that the input may influence the shape of the net, and expressing such architectures requires resorting a priori to the full power of a modern programming language, yielding what some have called differentiable programming [LeCun 2018]. This evolution of deep learning spurred the rapid development of differentiable programming frameworks [Abadi et al 2016; Paszke et al 2017] and, at the same time, received much attention in programming languages (PL) research for establishing its theoretical foundations [Abadi and Plotkin 2020; Brunel et al 2020; Elliott 2018; Huot et al 2020; Shaikhha et al 2019; Wang et al 2019].…”
Section: Introduction (mentioning)
confidence: 99%
“…The theory of AD transformations has by now been developed to considerable depth by several authors: Pearlmutter and Siskind first pointed out that reverse mode AD, commonly known as backpropagation, may be naturally expressed in terms of higher-order programs, and used this idea to develop a differentiable variant of Scheme [Pearlmutter and Siskind 2008]; more recently, Elliott emphasized functoriality as a systematic way of understanding the modular nature of AD transformations [Elliott 2018]; the work [Wang et al 2019] introduced Lantern, a fully general differentiable programming framework in which the notion of delimited continuation is used to correctly handle memory updates during backpropagation; finally, Brunel, Mazza and Pagani showed that the continuation-passing machinery at work in [Wang et al 2019] (and, implicitly, in [Pearlmutter and Siskind 2008]) may be understood in terms of linear negation (in the sense of Girard's linear logic), giving a purely functional transformation for reverse mode AD and a conceptually clean analysis of its efficiency in terms of a “linear factoring” evaluation rule [Brunel et al 2020]. On the semantics side, Abadi and Plotkin studied denotational semantics for a first order differentiable language [Abadi and Plotkin 2020] and Huot, Staton and Vákár gave a uniform approach to proving soundness of AD (forward and reverse) for simply-typed programs based on a diffeology semantics [Huot et al 2020].…”
Section: Introduction (mentioning)
confidence: 99%
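The citation above describes reverse-mode AD expressed through higher-order programs: each value is paired with a function (a backpropagator, or continuation) that, given the gradient flowing into that value, distributes it back to the inputs. The following is a minimal illustrative sketch of that idea, not code from any of the cited systems (Lantern, the Scheme variant, or the linear-negation transformation); the names `var`, `add`, `mul` and the dictionary-based gradient accumulation are assumptions made for the example.

```python
# A value is a pair (primal, backpropagator). The backpropagator takes the
# gradient flowing into the value and returns a dict of gradients for the
# named inputs it depends on.

def var(name, value):
    # A named input: its backpropagator records the incoming gradient.
    return (value, lambda grad: {name: grad})

def _merge(*dicts):
    # Sum gradient contributions arriving from different uses of a variable.
    out = {}
    for d in dicts:
        for k, v in d.items():
            out[k] = out.get(k, 0.0) + v
    return out

def add(a, b):
    (va, bpa), (vb, bpb) = a, b
    return (va + vb, lambda grad: _merge(bpa(grad), bpb(grad)))

def mul(a, b):
    (va, bpa), (vb, bpb) = a, b
    # Chain rule: d(a*b) = b*da + a*db.
    return (va * vb, lambda grad: _merge(bpa(grad * vb), bpb(grad * va)))

# f(x, y) = x*y + x, so df/dx = y + 1 and df/dy = x.
x, y = var("x", 3.0), var("y", 4.0)
value, backprop = add(mul(x, y), x)
print(value)          # 15.0
print(backprop(1.0))  # {'x': 5.0, 'y': 3.0}
```

The shared use of `x` in `x*y + x` is where the summing of contributions (the step that the cited “linear factoring” rule and the delimited-continuation machinery handle efficiently) becomes visible: both uses report gradients for `x`, and they must be added.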
“…Roughly speaking, AD acts on the code of a program by letting variables incorporate values for their derivative, and operators propagate derivatives according to the chain rule of differential calculus [52]. Due to its vast applications in machine learning (backpropagation [49] being an example of an AD technique) and, most notably, in deep learning [9], AD is rapidly becoming a topic of interest in the programming language theory community, as witnessed by the new line of research called differentiable programming (see, e.g., [28,50,16,1] for some recent results on AD and programming language theory developed in the latter field).…”
Section: Automatic Differentiation (mentioning)
confidence: 99%
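The description above (variables carry values for their derivatives, and operators propagate them by the chain rule) is exactly the forward-mode, dual-number view of AD. A minimal sketch of that view, written for illustration only and not drawn from any of the cited works:

```python
from dataclasses import dataclass

@dataclass
class Dual:
    # Each value carries its derivative; arithmetic propagates both.
    val: float
    der: float

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule: (u*v)' = u'*v + u*v'.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def f(x):
    return x * x + x            # f(x) = x^2 + x, so f'(x) = 2x + 1

# Seed the input derivative with 1 to compute df/dx at x = 3.
print(f(Dual(3.0, 1.0)))        # Dual(val=12.0, der=7.0)
```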
“…Algorithms for automatic differentiation have recently been extended to higher-order programming languages [50,46,51,42,45], and have been investigated from a semantical perspective in [16,1] relying on insights from linear logic and denotational semantics. In particular, the work of Huot et al [37] provides a denotational proof of correctness of the program transformation of [50] that we have studied in Section 5.…”
Section: Related Work (mentioning)
confidence: 99%