2018 · Preprint
DOI: 10.48550/arxiv.1811.06569

Stable Tensor Neural Networks for Rapid Deep Learning

Elizabeth Newman,
Lior Horesh,
Haim Avron
et al.

Abstract: We propose a tensor neural network (t-NN) framework that offers an exciting new paradigm for designing neural networks with multidimensional (tensor) data. Our network architecture is based on the t-product [16], an algebraic formulation to multiply tensors via circulant convolution. In this t-product algebra, we interpret tensors as t-linear operators analogous to matrices as linear operators, and hence our framework inherits mimetic matrix properties. To exemplify the elegant, matrix-mimetic algebraic struct…
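As a concrete illustration of the t-product named in the abstract, here is a minimal numpy sketch (my own, following the standard Kilmer–Martin definition rather than code from the paper): an FFT along the third mode block-diagonalizes the circulant structure, so the product reduces to independent matrix products on the frontal slices.

    import numpy as np

    def t_product(A, B):
        # t-product of third-order tensors A (m x n x l) and B (n x p x l).
        # The FFT along mode 3 turns circulant convolution into l
        # independent frontal-slice matrix products in the Fourier domain.
        m, n, l = A.shape
        assert B.shape[0] == n and B.shape[2] == l
        Ah = np.fft.fft(A, axis=2)
        Bh = np.fft.fft(B, axis=2)
        Ch = np.einsum('ijk,jpk->ipk', Ah, Bh)    # slice-wise matrix products
        return np.real(np.fft.ifft(Ch, axis=2))   # real for real inputs

    A = np.random.randn(4, 3, 5)
    B = np.random.randn(3, 2, 5)
    C = t_product(A, B)   # shape (4, 2, 5)

Because every Fourier-domain slice is an ordinary matrix product, matrix notions such as transposes, inverses, and factorizations carry over to t-linear operators, which is the matrix-mimetic structure the abstract refers to.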

Cited by 8 publications (9 citation statements) · References 23 publications
“…A major advantage of ELM is that learning the linear mapping of the hidden-layer outputs to the output layer is relatively simple and is independent of the activation functions used. Newman et al. proposed a tensor neural network intended for tensor data [27]. Their proposed networks use tensor-tensor products and enable the use of more compact parameter spaces [28].…”
Section: Tensor Linear Systems
Confidence: 99%
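To make "more compact parameter spaces" concrete, a back-of-the-envelope count (my own illustration with hypothetical sizes, not figures from [27] or [28]): a t-linear layer on an n × p × l tensor uses a weight tensor with n²l entries, while a dense layer on the flattened nl-vector needs (nl)² entries.

    n, l = 28, 28                   # hypothetical: a 28 x 28 image as l fibers of length n
    t_layer_params = n * n * l      # weight tensor W in R^{n x n x l}
    dense_params = (n * l) ** 2     # dense weight on the flattened n*l vector
    print(t_layer_params, dense_params, dense_params // t_layer_params)
    # 21952 614656 28 -> the t-linear layer is a factor of l smaller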
“…Randomized Kaczmarz is closely related to the popular optimization technique of stochastic gradient descent (SGD) [26]. Most closely related to this work is the tensor stochastic gradient descent recently implemented to train tensor neural networks under the t-product [27]. That work develops a tensor neural network framework for multidimensional data and does not delve into an algorithmic analysis of SGD under the t-product.…”
Section: Tensor Linear Systems
Confidence: 99%
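For context, classical randomized Kaczmarz is itself only a few lines. The sketch below shows the matrix version with the usual squared-row-norm sampling (an illustration of the general method, not the tensor SGD variant discussed in [27]).

    import numpy as np

    def randomized_kaczmarz(A, b, iters=10_000, seed=0):
        # Solve a consistent system Ax = b by repeatedly projecting the
        # iterate onto the hyperplane of one row, sampled with probability
        # proportional to its squared norm.
        rng = np.random.default_rng(seed)
        m, n = A.shape
        row_norms = np.sum(A**2, axis=1)
        probs = row_norms / row_norms.sum()
        x = np.zeros(n)
        for _ in range(iters):
            i = rng.choice(m, p=probs)
            x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
        return x

Each projection is a stochastic gradient step on the least-squares objective under a particular row-sampling distribution, which is the connection to SGD the statement mentions.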
“…where A ∈ R^{m×n×l}, X ∈ R^{n×p×l}, and B ∈ R^{m×p×l} are third-order tensors, and the operator * denotes the T-product introduced by Kilmer and Martin [1]. Problem (1.1) arises in many applications, including tensor dictionary learning [2][3][4][5][6][7], tensor neural networks [8], the boundary finite element method [9][10][11], etc. The T-product has the advantage that it preserves the information inherent in the flattening of a tensor, and with it many properties of numerical linear algebra can be extended to third- and higher-order tensors [12][13][14][15][16][17][18].…”
Section: Introduction
Confidence: 99%
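When A is F-square (n × n × l) and every Fourier-domain frontal slice is invertible, problem (1.1) decouples under an FFT along the third mode. A minimal numpy sketch of this standard direct solve (my own, under those assumptions, not taken from any of the cited works):

    import numpy as np

    def t_solve(A, B):
        # Solve A * X = B in the t-product sense for square A (n x n x l):
        # FFT along mode 3, solve l independent n x n systems, invert the FFT.
        Ah = np.fft.fft(A, axis=2)
        Bh = np.fft.fft(B, axis=2)
        Xh = np.empty_like(Bh)
        for k in range(A.shape[2]):
            Xh[:, :, k] = np.linalg.solve(Ah[:, :, k], Bh[:, :, k])
        return np.real(np.fft.ifft(Xh, axis=2))

Iterative schemes such as the Kaczmarz- and SGD-type methods above target the large-scale or inconsistent cases where these dense frontal solves are impractical.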
“…where 'bcirc(A)' is the block circulant matrix [19] generated by the F-square tensor A ∈ C^{n×n×p}. The T-function has also proved useful in stable tensor neural networks for rapid deep learning [31]. Special kinds of T-functions, such as tensor powers, have been used by Gleich, Chen, and Varah [10] in Arnoldi methods to compute the eigenvalues of tensors, and they also proposed a diagonal tensor canonical form.…”
Section: Introduction
Confidence: 99%
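The bcirc unfolding referred to here can be written out directly. A small numpy sketch (my illustration, using the usual convention that the frontal slices of A cycle down the block columns):

    import numpy as np

    def bcirc(A):
        # Block circulant matrix of a third-order tensor A (m x n x p):
        # block (i, j) holds the frontal slice A[:, :, (i - j) mod p].
        m, n, p = A.shape
        C = np.zeros((m * p, n * p), dtype=A.dtype)
        for i in range(p):
            for j in range(p):
                C[i*m:(i+1)*m, j*n:(j+1)*n] = A[:, :, (i - j) % p]
        return C

With this unfolding, the t-product A * X equals refolding bcirc(A) applied to the column-stacked frontal slices of X, which is why results from matrix algebra transfer to the tensor setting.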