2021
DOI: 10.48550/arxiv.2103.01931
Preprint

Categorical Foundations of Gradient-Based Learning

Abstract: We propose a categorical foundation of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach also generalises beyond neural net…
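To make the lens picture from the abstract concrete, here is a minimal sketch (my own illustration, not the authors' code): a lens pairs a forward map with a backward map that sends a change on the output back to a change on the input, and composing lenses reproduces the forward/backward structure of backpropagation. The names Lens, linear_lens, and mse_lens are hypothetical.

import numpy as np

class Lens:
    """A lens: a forward map fwd : A -> B and a backward map bwd : (A, dB) -> dA."""
    def __init__(self, fwd, bwd):
        self.fwd = fwd
        self.bwd = bwd

    def __rshift__(self, other):
        # Lens composition: forward passes run left to right,
        # backward passes run right to left (the shape of backpropagation).
        fwd = lambda a: other.fwd(self.fwd(a))
        bwd = lambda a, db: self.bwd(a, other.bwd(self.fwd(a), db))
        return Lens(fwd, bwd)

def linear_lens(w):
    # A linear layer as a lens; its backward pass is the transpose-Jacobian action.
    return Lens(fwd=lambda x: w @ x,
                bwd=lambda x, db: w.T @ db)

def mse_lens(target):
    # Mean-squared-error loss as a lens whose backward pass emits the error signal.
    return Lens(fwd=lambda y: 0.5 * np.sum((y - target) ** 2),
                bwd=lambda y, _dl: y - target)

# Usage: push an input forward, then pull the loss signal back to an input gradient.
w = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([1.0, -1.0])
model = linear_lens(w) >> mse_lens(target=np.array([0.0, 1.0]))
loss = model.fwd(x)        # forward pass
dx = model.bwd(x, 1.0)     # reverse pass: gradient of the loss with respect to x

Composition here is ordinary lens composition: forward maps compose left to right while backward maps compose right to left, which is exactly the data flow of reverse-mode differentiation.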

Cited by 7 publications (19 citation statements)
References 12 publications
“…Differential categories [25] have been introduced to axiomatise the notion of derivative. More recently, reverse derivative categories [26] generalised the notion of back-propagation; they have been proposed as a categorical foundation for gradient-based learning [27]. These frameworks all define the derivative of a morphism with respect to its domain.…”
Section: Related Work (mentioning)
confidence: 99%
“…A special case also appears in the resource theory of contextuality as defined in [1], where one first defines deterministic free processes, and probabilistic (but classical) transformations d → e are then defined as transformations d ⊗ c → e where c is a non-contextual (and thus free) resource. This construction is discussed more generally in [27,38], but we modify it slightly by allowing one to choose a class of objects as "parameters" instead of taking that class to consist of all objects: this modification is important for resource theories as it lets one control which resources are made freely available.…”
Section: Special Morphisms Get Special Pictures: Identities and Symme… (mentioning)
confidence: 99%
“…In Appendix C.3 we discuss a variant of a construction on monoidal categories, used in special cases in [35] and discussed in more detail in [27,38], that allows one to declare some resources free and thus enlarge the set of possible resource conversions.…”
Section: Introduction (mentioning)
confidence: 99%
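Both quotes above refer to the construction, used in the cited paper and in [38], that parametrises morphisms by a chosen class of objects. A minimal sketch of that idea, with names of my own choosing: a parametrised map A → B is a pair (P, f) with f : P × A → B, and composition pairs up the parameter objects.

class Para:
    """A parametrised map A -> B: a parameter object P together with f : P x A -> B."""
    def __init__(self, param, apply_fn):
        self.param = param          # the chosen parameter / freely available resource
        self.apply_fn = apply_fn    # apply_fn(p, a) -> b

    def __rshift__(self, other):
        # Composition (P, f) ; (Q, g): the composite is parametrised by the pair (Q, P).
        def composed(param_pair, a):
            q, p = param_pair
            return other.apply_fn(q, self.apply_fn(p, a))
        return Para((other.param, self.param), composed)

# Example: two affine maps whose weights are the declared parameters.
scale = Para(param=2.0, apply_fn=lambda p, a: p * a)   # scaling, parametrised by p
shift = Para(param=1.0, apply_fn=lambda q, b: b + q)   # translation, parametrised by q
composite = scale >> shift
print(composite.apply_fn(composite.param, 3.0))        # ((1.0, 2.0), 3.0) |-> 7.0

Restricting which objects may appear as parameters is exactly the modification described in the quote above: it controls which resources are treated as free.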
“…Cockett et al (2019) introduce Cartesian reverse derivative categories in which we can define an operator that shares certain properties with reverse-mode automatic differentiation (RD.1 to RD.7 in Definition 13 of Cockett et al (2019)). Reverse derivative categories are remarkably general: the category of Euclidean spaces and differentiable functions between them is of course a reverse derivative category, as are the categories of polynomials over semirings and the category of Boolean circuits (Wilson and Zanasi, 2021).…”
Section: Introduction (mentioning)
confidence: 99%
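For intuition about the operator described in the quote above, here is a small numerical sketch (my own, not Cockett et al.'s construction) of the reverse derivative in the familiar category of smooth maps, where R[f] sends a point x and an output covector dy to the transposed-Jacobian action J_f(x)^T dy. The helper name reverse_derivative is hypothetical.

import numpy as np

def reverse_derivative(f, x, dy, eps=1e-6):
    # Approximate R[f](x, dy) = J_f(x)^T @ dy by central finite differences.
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for i in range(x.size):
        bump = np.zeros_like(x)
        bump[i] = eps
        # The i-th entry of J_f(x)^T @ dy is the directional derivative along e_i paired with dy.
        out[i] = np.dot((f(x + bump) - f(x - bump)) / (2 * eps), dy)
    return out

# Example: f(x) = (x0*x1, x0^2); at x = (2, 3) the Jacobian is [[3, 2], [4, 0]],
# so R[f]((2, 3), (1, 0)) = J^T @ (1, 0) = (3, 2).
f = lambda x: np.array([x[0] * x[1], x[0] ** 2])
print(reverse_derivative(f, np.array([2.0, 3.0]), np.array([1.0, 0.0])))   # approx [3., 2.]

In the category of smooth maps this coincides with reverse-mode automatic differentiation; the point of Cockett et al.'s axioms is that the same operator also makes sense in settings such as polynomials over semirings and Boolean circuits.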
“…Means and standard errors from 100 experiments of 10 polynomials each are shown. The code to run these experiments is on GitHub at tinyurl.com/ku3pjz56. Cruttwell et al (2021) explore categorical formulations of automatic differentiation.…”
(mentioning)
confidence: 99%