2019
DOI: 10.1145/3371106

A simple differentiable programming language

Abstract: Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear; discrepancies can arise, sometimes inadvertently. In order to study automatic differentiation in such programming contexts, we define a small but expressive programming language that includes a construct for reverse-mode differentiation. […]


Cited by 52 publications (101 citation statements). References 31 publications (27 reference statements).
“…After the advances in deep learning of the last years, this is no longer the case: neural network architectures are now “dynamic”, in the sense that the input may influence the shape of the net, and expressing such architectures requires resorting a priori to the full power of a modern programming language, yielding what some have called differentiable programming [LeCun 2018]. This evolution of deep learning spurred the rapid development of differentiable programming frameworks [Abadi et al 2016; Paszke et al 2017] and, at the same time, received much attention in programming languages (PL) research for establishing its theoretical foundations [Abadi and Plotkin 2020; Brunel et al 2020; Elliott 2018; Huot et al 2020; Shaikhha et al 2019; Wang et al 2019].…”
Section: Introduction (mentioning)
confidence: 99%
“…The theory of AD transformations has by now been developed to considerable depth by several authors: Pearlmutter and Siskind first pointed out that reverse mode AD, commonly known as backpropagation, may be naturally expressed in terms of higher-order programs, and used this idea to develop a differentiable variant of Scheme [Pearlmutter and Siskind 2008]; more recently, Elliott emphasized functoriality as a systematic way of understanding the modular nature of AD transformations [Elliott 2018]; the work [Wang et al 2019] introduced Lantern, a fully general differentiable programming framework in which the notion of delimited continuation is used to correctly handle memory updates during backpropagation; finally, Brunel, Mazza and Pagani showed that the continuation-passing machinery at work in [Wang et al 2019] (and, implicitly, in [Pearlmutter and Siskind 2008]) may be understood in terms of linear negation (in the sense of Girard's linear logic), giving a purely functional transformation for reverse mode AD and a conceptually clean analysis of its efficiency in terms of a “linear factoring” evaluation rule [Brunel et al 2020]. On the semantics side, Abadi and Plotkin studied denotational semantics for a first order differentiable language [Abadi and Plotkin 2020] and Huot, Staton and Vákár gave a uniform approach to proving soundness of AD (forward and reverse) for simply-typed programs based on a diffeology semantics [Huot et al 2020].…”
Section: Introduction (mentioning)
confidence: 99%
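The citation above describes reverse-mode AD expressed through higher-order programs: each value is paired with a function (a backpropagator, or continuation) that, given the gradient flowing into that value, distributes it back to the inputs. The following is a minimal illustrative sketch of that idea, not code from any of the cited systems (Lantern, the Scheme variant, or the linear-negation transformation); the names `var`, `add`, `mul` and the dictionary-based gradient accumulation are assumptions made for the example.

```python
# A value is a pair (primal, backpropagator). The backpropagator takes the
# gradient flowing into the value and returns a dict of gradients for the
# named inputs it depends on.

def var(name, value):
    # A named input: its backpropagator records the incoming gradient.
    return (value, lambda grad: {name: grad})

def _merge(*dicts):
    # Sum gradient contributions arriving from different uses of a variable.
    out = {}
    for d in dicts:
        for k, v in d.items():
            out[k] = out.get(k, 0.0) + v
    return out

def add(a, b):
    (va, bpa), (vb, bpb) = a, b
    return (va + vb, lambda grad: _merge(bpa(grad), bpb(grad)))

def mul(a, b):
    (va, bpa), (vb, bpb) = a, b
    # Chain rule: d(a*b) = b*da + a*db.
    return (va * vb, lambda grad: _merge(bpa(grad * vb), bpb(grad * va)))

# f(x, y) = x*y + x, so df/dx = y + 1 and df/dy = x.
x, y = var("x", 3.0), var("y", 4.0)
value, backprop = add(mul(x, y), x)
print(value)          # 15.0
print(backprop(1.0))  # {'x': 5.0, 'y': 3.0}
```

The shared use of `x` in `x*y + x` is where the summing of contributions (the step that the cited “linear factoring” rule and the delimited-continuation machinery handle efficiently) becomes visible: both uses report gradients for `x`, and they must be added.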
“…Roughly speaking, AD acts on the code of a program by letting variables incorporate values for their derivative, and operators propagate derivatives according to the chain rule of differential calculus [52]. Due to its vast applications in machine learning (backpropagation [49] being an example of an AD technique) and, most notably, in deep learning [9], AD is rapidly becoming a topic of interest in the programming language theory community, as witnessed by the new line of research called differentiable programming (see, e.g., [28,50,16,1] for some recent results on AD and programming language theory developed in the latter field).…”
Section: Automatic Differentiation (mentioning)
confidence: 99%
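The description above (variables carry values for their derivatives, and operators propagate them by the chain rule) is exactly the forward-mode, dual-number view of AD. A minimal sketch of that view, written for illustration only and not drawn from any of the cited works:

```python
from dataclasses import dataclass

@dataclass
class Dual:
    # Each value carries its derivative; arithmetic propagates both.
    val: float
    der: float

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule: (u*v)' = u'*v + u*v'.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def f(x):
    return x * x + x            # f(x) = x^2 + x, so f'(x) = 2x + 1

# Seed the input derivative with 1 to compute df/dx at x = 3.
print(f(Dual(3.0, 1.0)))        # Dual(val=12.0, der=7.0)
```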
“…Algorithms for automatic differentiation have recently been extended to higher-order programming languages [50,46,51,42,45], and have been investigated from a semantical perspective in [16,1] relying on insights from linear logic and denotational semantics. In particular, the work of Huot et al [37] provides a denotational proof of correctness of the program transformation of [50] that we have studied in Section 5.…”
Section: Related Work (mentioning)
confidence: 99%