Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
2016, Preprint
DOI: 10.48550/arxiv.1612.05231

Cited by 36 publications (30 citation statements). References 10 publications.
“…A unitary matrix features eigenvalues of unit modulus and reversibility. It is widely used as an approach to ease the gradient exploding and vanishing problem (Arjovsky et al, 2015; Wisdom et al, 2016; Jing et al, 2016) and the memory wall problem (Luo et al, 2019). One of the simplest ways to parametrize a unitary matrix is to represent it as a product of two-level unitary operations (Jing et al, 2016).…”
Section: Unitary Matrices
confidence: 99%
“…It is widely used as an approach to ease the gradient exploding and vanishing problem (Arjovsky et al, 2015; Wisdom et al, 2016; Jing et al, 2016) and the memory wall problem (Luo et al, 2019). One of the simplest ways to parametrize a unitary matrix is to represent it as a product of two-level unitary operations (Jing et al, 2016). A real unitary matrix of size N can be parametrized compactly by N(N − 1)/2 rotation operations (Li et al, 2013).…”
Section: Unitary Matrices
confidence: 99%
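As a concrete illustration of the parametrization these excerpts describe, here is a minimal NumPy sketch (not the paper's EUNN implementation; `givens` and `orthogonal_from_rotations` are hypothetical helper names) that composes N(N − 1)/2 two-level Givens rotations into a real orthogonal matrix:

```python
# Minimal sketch (not the paper's EUNN code): compose N*(N-1)/2 two-level
# (Givens) rotations into a real orthogonal matrix.
import numpy as np

def givens(n, i, j, theta):
    """Identity matrix except for a 2x2 rotation acting on coordinates i and j."""
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i] = g[j, j] = c
    g[i, j], g[j, i] = -s, s
    return g

def orthogonal_from_rotations(thetas, n):
    """Multiply the N(N-1)/2 two-level rotations into a dense orthogonal matrix."""
    assert len(thetas) == n * (n - 1) // 2
    q = np.eye(n)
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            q = givens(n, i, j, thetas[k]) @ q
            k += 1
    return q

if __name__ == "__main__":
    n = 4
    rng = np.random.default_rng(0)
    thetas = rng.uniform(0.0, 2.0 * np.pi, n * (n - 1) // 2)
    q = orthogonal_from_rotations(thetas, n)
    print(np.allclose(q @ q.T, np.eye(n)))  # True: q is orthogonal by construction
```

In the complex (unitary) case each two-level block carries additional phase parameters; the EUNN construction, per the paper's title, makes the number of such layers tunable to trade expressivity against cost.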
“…IGLOO may be less impacted by permutation than RNN style structures because it is finding a representation for a sequence not by looking at each element sequentially but as a whole, taking patches from the whole input space. Reported accuracies (MNIST / pMNIST, %):
(Le et al, 2015): 97.0 / 82.0
uRNN (Arjovsky et al, 2016): 95.1 / 91.4
LSTM: 98.3 / 89.4
EURNN (Jing et al, 2016): - / 93.7
TCN (Bai et al, 2018): 99.0 / 97.2
r-LSTM (Trinh et al, 2018): 98.4 / 95.2
IndRNN (Li et al, 2018): 99.0 / 96.0
KRU (Jose et al, 2017): 96.4 / 94.5
Dilated GRU (Chang et al, 2018): [values truncated in the excerpt]
We note that while IGLOO and the CuDNN LSTM run at a similar speed of about 30 seconds per epoch, the plain LSTM is much slower, taking about 540 seconds per epoch for a cell with 128 hidden units. We therefore achieve superior accuracy on the pMNIST benchmark at per-epoch speeds similar to the fast NVIDIA-optimized CuDNN LSTM cell.…”
Section: Sequential MNIST and Permuted MNIST
confidence: 99%
“…Dealing with very long term dependencies is a current area of research, and recent papers have introduced new variations which aim at fixing this issue and improving on the historical models: IndRNN (Shai et al, 2018) and RNNs with auxiliary losses (Trinh et al, 2018). Earlier works also include the uRNN (Arjovsky et al, 2016), Quasi-Recurrent Neural Networks (Q-RNN) (Bradbury et al, 2016), Dilated RNN (Chang et al, 2017), Recurrent Additive Networks (Lee et al, 2017), ChronoNet (Roy et al, 2018), EUNN (Jing et al, 2016), Kronecker Recurrent Units (KRU) (Jose et al, 2017) and the Recurrent Weighted Average (Ostmeyer et al, 2017).…”
Section: Introduction
confidence: 99%
“…Unitary recurrent neural networks [36,37,38] refine vanilla RNNs by parametrizing their transition matrix to be unitary. These networks are reversible in exact arithmetic [36]: the conjugate transpose of the transition matrix is its inverse, so the hidden-to-hidden transition is reversible.…”
Section: Related Work
confidence: 99%
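The reversibility noted in this last excerpt follows from unitarity alone: since W^H W = I, the linear hidden-to-hidden step can be inverted exactly. A minimal NumPy sketch under that assumption (omitting the nonlinearity and the input term of a full RNN cell):

```python
# Minimal sketch of the reversibility claim, assuming a purely linear
# unitary transition h_t = W h_{t-1}; the nonlinearity and input term of a
# full RNN cell are omitted here.
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Draw a random unitary W: the Q factor of a complex Gaussian matrix.
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
w, _ = np.linalg.qr(a)

h_prev = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h_next = w @ h_prev                 # forward hidden-to-hidden transition
h_recovered = w.conj().T @ h_next   # reverse it with the conjugate transpose

print(np.allclose(w.conj().T @ w, np.eye(n)))  # True: W^H W = I (unitary)
print(np.allclose(h_recovered, h_prev))        # True: transition reversed exactly
```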