2018
DOI: 10.48550/arxiv.1803.06396
Preprint

Reviving and Improving Recurrent Back-Propagation

Renjie Liao,
Yuwen Xiong,
Ethan Fetaya
et al.

Abstract: In this paper, we revisit the recurrent backpropagation (RBP) algorithm (Almeida, 1987; Pineda, 1987), discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship between Neumann-RBP and back propagation through time (BPTT) and its truncated version (TBPTT). Our Neumann-RBP has th…
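
To make the Neumann-RBP idea in the abstract concrete, the following is a minimal sketch in PyTorch of how a truncated Neumann series can stand in for the exact matrix inverse in the implicit gradient at a fixed point. The names (neumann_rbp_vjp, F, h_star, loss_grad) are illustrative assumptions, not the authors' released code.

    import torch

    def neumann_rbp_vjp(F, h_star, loss_grad, num_terms=20):
        # Sketch: approximate v = loss_grad @ (I - dF/dh)^{-1} at the fixed point
        # h_star with the truncated Neumann series sum_{k=0}^{K} loss_grad @ (dF/dh)^k.
        # F maps the hidden state to the next hidden state (parameters closed over);
        # loss_grad is dL/dh evaluated at h_star.
        h_star = h_star.detach().requires_grad_(True)
        Fh = F(h_star)              # one application of the update at the fixed point
        v = loss_grad.clone()
        total = loss_grad.clone()
        for _ in range(num_terms):
            # one vector-Jacobian product per series term: v <- v * dF/dh
            (v,) = torch.autograd.grad(Fh, h_star, grad_outputs=v, retain_graph=True)
            total = total + v
        return total

The parameter gradient then follows from one more vector-Jacobian product of the accumulated vector through F with respect to the weights; memory stays constant in the number of series terms, which is the contrast with TBPTT drawn in the abstract.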

Cited by 2 publications (2 citation statements). References 21 publications (30 reference statements).
“…Approximate Implicit Differentiation (AID) [11,14,15,18,22,26,58] and Iterative Differentiation (ITD) [9,10,34,42]. ITD methods first solve the lower-level problem approximately and then calculate the hypergradient with backward (forward) automatic differentiation, while AID methods approximate the exact hypergradient [11,19,30,33]. In [12], the authors compare these two categories of methods in terms of their hyperiteration complexity.…”
Section: Related Work (mentioning)
confidence: 99%
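
As a reading aid for the AID/ITD distinction in the statement above, the generic bilevel hypergradient can be written as follows; the notation is standard rather than the cited papers' exact symbols. AID methods approximate the inner linear solve (e.g., by conjugate gradient or a Neumann series), while ITD methods differentiate through the unrolled lower-level iterates.

    % Bilevel problem: min_{\lambda} f(\lambda, w^{*}(\lambda)),
    %                  w^{*}(\lambda) = \arg\min_{w} g(\lambda, w)
    \nabla_{\lambda} f
      = \partial_{\lambda} f(\lambda, w^{*})
      - \partial^{2}_{\lambda w} g(\lambda, w^{*})
        \bigl[ \partial^{2}_{ww} g(\lambda, w^{*}) \bigr]^{-1}
        \partial_{w} f(\lambda, w^{*})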
“…An important classical algorithm that is intimately related to EProp is recurrent backprop [8,342,343], where, for a given set of (clamped) inputs and target states, the neural model is trained to settle into a stable activation state where the output units correspond to/align with desired target activity. Recent work has notably revised recurrent backprop, proposing useful modifications to improve its performance/effectiveness [245]. One key biological implausibility of recurrent backprop, however, is that, although it considers the same objective that EProp optimizes, its constrained (Lagrangian) formulation of the optimization problem leads to computing a fixed point for its second phase using a linearized form of the recurrent network itself.…”
mentioning
confidence: 99%
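
For readers unfamiliar with the "linearized second phase" mentioned in the statement above, a standard way to write recurrent backprop's two relaxations is the following sketch; the symbols (h*, F, z*, w) are generic and not taken from the cited works.

    % Phase 1: relax the recurrent dynamics to a stable state
    h^{*} = F(h^{*}, x; w)
    % Phase 2: relax the *linearized* adjoint system to its own fixed point
    z^{*} = \Bigl( \tfrac{\partial F}{\partial h} \Big|_{h^{*}} \Bigr)^{\top} z^{*}
            + \tfrac{\partial L}{\partial h^{*}}
    % Weight update from the adjoint fixed point
    \nabla_{w} L = \Bigl( \tfrac{\partial F}{\partial w} \Big|_{h^{*}} \Bigr)^{\top} z^{*}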