Tensor decomposition is a fundamental unsupervised machine learning method in data science, with applications including network analysis and sensor data processing. This work develops a generalized canonical polyadic (GCP) low-rank tensor decomposition that allows loss functions other than squared error. For instance, we can use logistic loss or Kullback-Leibler divergence, enabling tensor decomposition for binary or count data. We present a variety of statistically motivated loss functions for various scenarios. We provide a generalized framework for computing gradients and handling missing data that enables the use of standard optimization methods for fitting the model. We demonstrate the flexibility of GCP on several real-world examples, including interactions in a social network, neural activity in a mouse, and monthly rainfall measurements in India.
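To make the GCP idea concrete, the sketch below shows how an elementwise loss and its gradient plug into a CP model. The function names and the choice of Poisson loss for count data are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def cp_model(A, B, C):
    """Rank-R CP model of a 3-way tensor; R = number of factor columns."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# Poisson loss for count data: f(x, m) = m - x*log(m), with df/dm = 1 - x/m.
def poisson_grad(X, M):
    return 1.0 - X / M  # elementwise gradient w.r.t. the model tensor

rng = np.random.default_rng(0)
A, B, C = (rng.uniform(0.5, 1.5, (n, 2)) for n in (4, 5, 6))
X = rng.poisson(cp_model(A, B, C))       # synthetic count data
G = poisson_grad(X, cp_model(A, B, C))   # feeds any standard optimizer
# Gradient w.r.t. factor A contracts G against the other two factors:
GA = np.einsum('ijk,jr,kr->ir', G, B, C)
```

Swapping in a different scenario (e.g., logistic loss for binary data) only changes the elementwise loss and gradient functions; the contraction against the remaining factors stays the same.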
The dominant contribution to communication complexity in factorizing a matrix using QR with column pivoting is the column-norm updates required to process pivot decisions. We use randomized sampling to approximate this process, which dramatically reduces the communication required for column selection. We also introduce a sample update formula that reduces the cost of sampling trailing matrices. Using our column selection mechanism, we obtain results comparable in quality to those of the QRCP algorithm, but with performance near that of unpivoted QR. We also demonstrate strong parallel scalability on shared-memory multicore systems using an implementation in Fortran with OpenMP. This work extends immediately to produce low-rank truncated approximations of large matrices. We propose a truncated QR factorization with column pivoting that avoids the trailing matrix updates used in current implementations of level-3 BLAS QR and QRCP. Provided the truncation rank is small, avoiding trailing matrix updates reduces approximation time by nearly half. By using these techniques and employing a variation on Stewart's QLP algorithm, we develop an approximate truncated SVD that runs nearly as fast as truncated QR.

Keywords: QR factorization, column pivoting, random sampling, sample update, blocked algorithm, low-rank approximation, truncated SVD

1. Introduction. We explore a variation of QR with Column Pivoting (QRCP) that uses randomized sampling (RQRCP) to process blocks of pivots. The magnitudes of trailing column norms are estimated by applying Gaussian random compression matrices to produce smaller sample matrices. We analyze the probability distributions of sample column norms to justify the internal updating computation used to select blocks of column pivots. P. G. Martinsson [16] independently developed a very similar approach in parallel with this research.
The primary difference is our introduction of a sample update formula, which reduces the matrix-multiplication complexity of the full factorization by one third relative to re-sampling after each block of column pivots. We also extend this method to produce truncated low-rank approximations. We propose an implementation that avoids the trailing update computation on the full matrix, further reducing the time spent in matrix multiplication to nearly half of what a truncated version with trailing updates (also employing one of our sample update formulas) would require. Furthermore, the Truncated Randomized QR with Column Pivoting algorithm (TRQRCP) extends immediately to an approximate truncated SVD via a variation on Stewart's QLP algorithm [20]. We achieve matrix factorizations of quality similar to standard QRCP while retaining the communication complexity of unpivoted QR. The algorithms have been implemented and tested in Fortran with OpenMP on shared-memory 24-core systems. Our performance experiments compare these algorithms against LAPACK subroutines linked with the Intel Math...
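The column-selection idea can be sketched in a few lines: compress the matrix with a Gaussian map, then pick pivots by sample column norms instead of full column norms. This is a minimal illustration under assumed dimensions, not the Fortran implementation described above:

```python
import numpy as np

# Minimal sketch of randomized column selection: the sample S = Omega @ A
# is far smaller than A, yet its column norms track A's column norms, so
# a pivot can be chosen from S alone. Dimensions are illustrative.
rng = np.random.default_rng(1)
m, n, ell = 500, 40, 12                  # ell ~ block size + oversampling
A = rng.standard_normal((m, n))
A[:, 7] *= 50.0                          # plant one dominant column

Omega = rng.standard_normal((ell, m)) / np.sqrt(ell)  # compression matrix
S = Omega @ A                            # ell x n sample matrix, cheap to work with

pivot_sampled = int(np.argmax(np.linalg.norm(S, axis=0)))
pivot_true = int(np.argmax(np.linalg.norm(A, axis=0)))
# with high probability both select the same (largest-norm) column
```

The sample update formula mentioned above goes further: after a block of pivots is processed, S is updated in place rather than recomputed as a fresh product with the trailing matrix, which is where the one-third reduction in matrix-multiplication cost comes from.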
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is widely used to compute eigenvalues of large sparse symmetric matrices. The algorithm can suffer from numerical instability if it is not implemented with care, which is especially problematic when the number of eigenpairs to be computed is relatively large. In this paper we propose an improved basis selection strategy based on earlier work by Hetmaniuk and Lehoucq, as well as a robust, backward-stable convergence criterion, to enhance robustness. We also suggest several algorithmic optimizations that improve the performance of practical LOBPCG implementations. Numerical examples confirm that our approach consistently and significantly outperforms previous competing approaches in both stability and speed.

In LOBPCG, the basis from which the approximate solution to the eigenvalue problem is extracted can become linearly dependent. This problem becomes progressively worse as the number of eigenpairs to be computed grows relatively large (e.g., hundreds or thousands). For example, in electronic structure calculations, the number of desired eigenpairs is proportional to the number of atoms in the system, which can reach several thousand [13]. Hence remedies for improving numerical stability are of practical interest. A strategy proposed in the work of Hetmaniuk and Lehoucq [8] addresses this issue. Their strategy performs additional orthogonalization to ensure that the preconditioned gradient is numerically B-orthogonal to both the current and the previous approximations to the desired eigenvectors. However, this strategy can become expensive when the number of eigenpairs to be computed is relatively large. More importantly, reliability can still be severely compromised by numerical instability within the orthogonalization steps. This paper presents an efficient and reliable implementation of LOBPCG.
We develop a number of techniques that significantly enhance the Hetmaniuk-Lehoucq (HL) orthogonalization strategy in both efficiency and reliability. We also adopt an alternative convergence criterion to ensure achievable error control in the computed eigenpairs. For simplicity, we assume that both A and B, the matrices defining the generalized eigenvalue problem, are real; our techniques carry over naturally to complex Hermitian matrices. The rest of this paper is organized as follows. In Section 2, we describe the basic LOBPCG algorithm. In Section 3, we discuss numerical difficulties one may encounter in LOBPCG and the HL strategy for overcoming them. In Section 4, we present our techniques for improving the HL strategy. In Section 5, we present additional techniques for improving other aspects of LOBPCG. Finally, in Section 6, we report numerical experiments that illustrate the effectiveness of our techniques.
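For orientation, a bare-bones LOBPCG iteration for the standard problem (B = I) looks as follows. This sketch deliberately omits the preconditioner, the HL orthogonalization, and the stability safeguards that are the subject of the paper; all names are illustrative:

```python
import numpy as np

def lobpcg_sketch(A, X, iters=60):
    """Minimal LOBPCG for the smallest eigenpairs of symmetric A (B = I).
    No preconditioner and no stability safeguards; illustration only."""
    n, k = X.shape
    X, _ = np.linalg.qr(X)               # orthonormal initial block
    P = np.zeros((n, k))                 # conjugate-direction block
    evals = np.zeros(3 * k)
    for _ in range(iters):
        AX = A @ X
        theta = np.sum(X * AX, axis=0)   # current Ritz values
        W = AX - X * theta               # residual block (gradient)
        S, _ = np.linalg.qr(np.hstack([X, W, P]))  # trial basis [X, W, P]
        evals, C = np.linalg.eigh(S.T @ A @ S)     # Rayleigh-Ritz step
        X_new = S @ C[:, :k]
        P = X_new - X @ (X.T @ X_new)    # retain the new search direction
        X = X_new
    return evals[:k], X

A = np.diag(np.arange(1.0, 101.0))       # toy spectrum 1..100
vals, vecs = lobpcg_sketch(A, np.random.default_rng(2).standard_normal((100, 4)))
```

The instability the paper targets appears in the `hstack`/`qr` step: as residuals shrink, the columns of the trial basis become nearly linearly dependent, and naive orthonormalization can break down.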
Information theory provides a mathematical foundation for measuring uncertainty in belief. Belief is represented by a probability distribution that captures our understanding of an outcome's plausibility. Information measures based on Shannon's concept of entropy include realization information, Kullback-Leibler divergence, Lindley's information in experiment, cross entropy, and mutual information. We derive a general theory of information from first principles that accounts for evolving belief and recovers all of these measures. Rather than simply gauging uncertainty, information is understood in this theory to measure change in belief. We may then regard entropy as the information we expect to gain upon realization of a discrete latent random variable. This theory of information is compatible with the Bayesian paradigm, in which rational belief is updated as evidence becomes available. Furthermore, the theory admits novel measures of information with well-defined properties, which we explore in both analysis and experiment. This view of information illuminates the study of machine learning by allowing us to quantify the information captured by a predictive model and distinguish it from the residual information contained in training data. We gain related insights regarding feature selection, anomaly detection, and novel Bayesian approaches.

Postulate 3. If belief does not change, then no information is gained, regardless of the view of expectation.

Postulate 4. The information gained from any normalized prior state of belief q_0(z) to an updated state of belief r(z), in the view of r(z), must be nonnegative.
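Postulates 3 and 4 can be checked numerically if the information gained from q_0(z) to r(z) in the view of r(z) is taken to be the Kullback-Leibler divergence D(r || q_0), one of the measures the abstract says the theory recovers. The function name below is illustrative:

```python
import numpy as np

def info_gain(q0, r):
    """KL divergence D(r || q0): information gained in moving from the
    prior belief q0 to the updated belief r, in the view of r."""
    q0, r = np.asarray(q0, float), np.asarray(r, float)
    mask = r > 0                         # 0 * log(0) terms contribute nothing
    return float(np.sum(r[mask] * np.log(r[mask] / q0[mask])))

q0 = np.full(4, 0.25)                    # uniform prior belief over 4 outcomes
r = np.array([0.7, 0.1, 0.1, 0.1])       # updated belief after evidence

print(info_gain(q0, q0))                 # Postulate 3: no change, zero gain
print(info_gain(q0, r))                  # Postulate 4: nonnegative gain
```

By Gibbs' inequality D(r || q_0) >= 0 with equality exactly when r = q_0, so both postulates hold for this particular measure.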