2018 · Preprint
DOI: 10.48550/arxiv.1811.06569

Stable Tensor Neural Networks for Rapid Deep Learning

Elizabeth Newman,
Lior Horesh,
Haim Avron
et al.

Abstract: We propose a tensor neural network (t-NN) framework that offers an exciting new paradigm for designing neural networks with multidimensional (tensor) data. Our network architecture is based on the t-product [16], an algebraic formulation to multiply tensors via circulant convolution. In this t-product algebra, we interpret tensors as t-linear operators analogous to matrices as linear operators, and hence our framework inherits mimetic matrix properties. To exemplify the elegant, matrix-mimetic algebraic struct…
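As a concrete illustration of the t-product named in the abstract, here is a minimal numpy sketch (my own, following the standard Kilmer–Martin definition rather than code from the paper): an FFT along the third mode block-diagonalizes the circulant structure, so the product reduces to independent matrix products on the frontal slices.

    import numpy as np

    def t_product(A, B):
        # t-product of third-order tensors A (m x n x l) and B (n x p x l).
        # The FFT along mode 3 turns circulant convolution into l
        # independent frontal-slice matrix products in the Fourier domain.
        m, n, l = A.shape
        assert B.shape[0] == n and B.shape[2] == l
        Ah = np.fft.fft(A, axis=2)
        Bh = np.fft.fft(B, axis=2)
        Ch = np.einsum('ijk,jpk->ipk', Ah, Bh)    # slice-wise matrix products
        return np.real(np.fft.ifft(Ch, axis=2))   # real for real inputs

    A = np.random.randn(4, 3, 5)
    B = np.random.randn(3, 2, 5)
    C = t_product(A, B)   # shape (4, 2, 5)

Because every Fourier-domain slice is an ordinary matrix product, matrix notions such as transposes, inverses, and factorizations carry over to t-linear operators, which is the matrix-mimetic structure the abstract refers to.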

Cited by 8 publications (9 citation statements) · References 23 publications
“…A major advantage of ELM is that learning the linear mapping of the hidden-layer outputs to the output layer is relatively simple and is independent of the activation functions used. Newman et al. proposed a tensor neural network intended for tensor data [27]. Their proposed networks use tensor-tensor products and enable the use of more compact parameter spaces [28].…”
Section: Tensor Linear Systems
Confidence: 99%
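To make "more compact parameter spaces" concrete, a back-of-the-envelope count (my own illustration with hypothetical sizes, not figures from [27] or [28]): a t-linear layer on an n × p × l tensor uses a weight tensor with n²l entries, while a dense layer on the flattened nl-vector needs (nl)² entries.

    n, l = 28, 28                   # hypothetical: a 28 x 28 image as l fibers of length n
    t_layer_params = n * n * l      # weight tensor W in R^{n x n x l}
    dense_params = (n * l) ** 2     # dense weight on the flattened n*l vector
    print(t_layer_params, dense_params, dense_params // t_layer_params)
    # 21952 614656 28 -> the t-linear layer is a factor of l smaller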
“…Randomized Kaczmarz is closely related to the popular optimization technique of stochastic gradient descent (SGD) [26]. Most closely related to this work is the tensor stochastic gradient descent recently implemented to train tensor neural networks under the t-product [27]. That work develops a tensor neural network framework for multidimensional data and does not delve into an algorithmic analysis of SGD under the t-product.…”
Section: Tensor Linear Systems
Confidence: 99%
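For context, classical randomized Kaczmarz is itself only a few lines. The sketch below shows the matrix version with the usual squared-row-norm sampling (an illustration of the general method, not the tensor SGD variant discussed in [27]).

    import numpy as np

    def randomized_kaczmarz(A, b, iters=10_000, seed=0):
        # Solve a consistent system Ax = b by repeatedly projecting the
        # iterate onto the hyperplane of one row, sampled with probability
        # proportional to its squared norm.
        rng = np.random.default_rng(seed)
        m, n = A.shape
        row_norms = np.sum(A**2, axis=1)
        probs = row_norms / row_norms.sum()
        x = np.zeros(n)
        for _ in range(iters):
            i = rng.choice(m, p=probs)
            x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
        return x

Each projection is a stochastic gradient step on the least-squares objective under a particular row-sampling distribution, which is the connection to SGD the statement mentions.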
“…where A ∈ R^{m×n×l}, X ∈ R^{n×p×l}, and B ∈ R^{m×p×l} are third-order tensors, and the operator * denotes the T-product introduced by Kilmer and Martin [1]. Problem (1.1) arises in many applications, including tensor dictionary learning [2][3][4][5][6][7], tensor neural networks [8], the boundary finite element method [9][10][11], etc. The T-product has the advantage that it preserves the information inherent in the flattening of a tensor, and with it many properties of numerical linear algebra can be extended to third- and higher-order tensors [12][13][14][15][16][17][18].…”
Section: Introduction
Confidence: 99%
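When A is F-square (n × n × l) and every Fourier-domain frontal slice is invertible, problem (1.1) decouples under an FFT along the third mode. A minimal numpy sketch of this standard direct solve (my own, under those assumptions, not taken from any of the cited works):

    import numpy as np

    def t_solve(A, B):
        # Solve A * X = B in the t-product sense for square A (n x n x l):
        # FFT along mode 3, solve l independent n x n systems, invert the FFT.
        Ah = np.fft.fft(A, axis=2)
        Bh = np.fft.fft(B, axis=2)
        Xh = np.empty_like(Bh)
        for k in range(A.shape[2]):
            Xh[:, :, k] = np.linalg.solve(Ah[:, :, k], Bh[:, :, k])
        return np.real(np.fft.ifft(Xh, axis=2))

Iterative schemes such as the Kaczmarz- and SGD-type methods above target the large-scale or inconsistent cases where these dense frontal solves are impractical.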
“…where 'bcirc(A)' is the block circulant matrix [19] generated by the F-square tensor A ∈ C^{n×n×p}. The T-function has also proved useful in stable tensor neural networks for rapid deep learning [31]. Special kinds of T-functions, such as tensor powers, have been used by Gleich, Chen, and Varah [10] in Arnoldi methods to compute the eigenvalues of tensors, and they also proposed a diagonal tensor canonical form.…”
Section: Introduction
Confidence: 99%
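The bcirc unfolding referred to here can be written out directly. A small numpy sketch (my illustration, using the usual convention that the frontal slices of A cycle down the block columns):

    import numpy as np

    def bcirc(A):
        # Block circulant matrix of a third-order tensor A (m x n x p):
        # block (i, j) holds the frontal slice A[:, :, (i - j) mod p].
        m, n, p = A.shape
        C = np.zeros((m * p, n * p), dtype=A.dtype)
        for i in range(p):
            for j in range(p):
                C[i*m:(i+1)*m, j*n:(j+1)*n] = A[:, :, (i - j) % p]
        return C

With this unfolding, the t-product A * X equals refolding bcirc(A) applied to the column-stacked frontal slices of X, which is why results from matrix algebra transfer to the tensor setting.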