DOI: 10.1007/978-3-540-87481-2_24

A Unified View of Matrix Factorization Models

Abstract: We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman co-clustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has sp…
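The alternating projection idea in the abstract can be illustrated with its simplest special case: squared loss (plain Bregman divergence), where each projection reduces to a ridge-regularized least-squares solve. The function name, the ridge term `lam`, and the iteration count below are illustrative assumptions, not the paper's notation; this is a minimal sketch, not the paper's algorithm.

```python
import numpy as np

def alternating_factorize(X, rank, lam=1e-6, iters=50, seed=0):
    """Low-rank factorization X ~ U @ V by alternating least squares.

    Each half-step is one 'projection': holding one factor fixed,
    the other is the closed-form ridge-regression minimizer.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((rank, n))
    I = lam * np.eye(rank)
    for _ in range(iters):
        # Project onto U with V fixed, then onto V with U fixed.
        U = X @ V.T @ np.linalg.inv(V @ V.T + I)
        V = np.linalg.inv(U.T @ U + I) @ U.T @ X
    return U, V

# Usage: recover an exactly rank-2 matrix.
true_U = np.arange(12.0).reshape(6, 2)
true_V = np.ones((2, 4))
X = true_U @ true_V
U, V = alternating_factorize(X, rank=2)
err = np.linalg.norm(X - U @ V) / np.linalg.norm(X)
```

Swapping the squared loss for another Bregman divergence changes each projection from a linear solve to a convex (e.g., Newton) subproblem, which is the generalization the paper develops.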

Cited by 107 publications (86 citation statements)
References 35 publications
“…However, this is not the only way to deal with this kind of problem. For instance, matrix co-factorization (see, e.g., [28]) and tensor co-factorization can be another paradigm of combining explicit features and hidden features.…”
Section: Summary and Discussion
confidence: 99%
“…Similarly, Chapter 8 in Tropp's 2004 PhD thesis [Tro04] explored a number of new regularizers, presenting a range of clustering problems as matrix factorization problems with constraints, and anticipated the k-SVD algorithm [AEB06]. Singh and Gordon [SG08] offered a complete view of the state of the literature on matrix factorization in Table 1 of their 2008 paper, and noted that by changing the loss function and regularizer, one may recover algorithms including PCA, weighted PCA, k-means, k-medians, ℓ1 SVD, probabilistic latent semantic indexing (pLSI), nonnegative matrix factorization with ℓ2 or KL-divergence loss, exponential family PCA, and MMMF. Witten et al. introduced the statistics community to sparsity-inducing matrix factorization in a 2009 paper on penalized matrix decomposition, with applications to sparse PCA and canonical correlation analysis [WTH09].…”
Section: Gordon's Generalized
confidence: 99%
“…For example, there are variants on alternating minimization (with alternating least squares as a special case) [DLYT76, YDLT76, TYDL77, DL84, DLM09], alternating Newton methods [Gor02,SG08], (stochastic or incremental) gradient descent [KO09, LRS + 10, NRRW11, RRWN11, BRRT12, YYH + 13, RR13], conjugate gradients [RS05,SJ03], expectation minimization (EM) (or "soft-impute") methods [TB99,SJ03,MHT10,HMLZ14], multiplicative updates [LS99], and convex relaxations to semidefinite programs [SRJ04,FHB04,RFP10,FM13].…”
Section: Gordon's Generalized
confidence: 99%
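Among the solver families listed above, the multiplicative updates of Lee and Seung [LS99] are easy to sketch for NMF with squared loss. The variable names and the `eps` guard against division by zero are assumptions of this sketch, not taken from any of the cited papers.

```python
import numpy as np

def nmf_multiplicative(X, rank, iters=200, eps=1e-9, seed=0):
    """NMF X ~ W @ H via Lee-Seung multiplicative updates.

    The ratio-form updates keep W and H elementwise nonnegative
    and do not increase the squared Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage: a nonnegative rank-1 matrix is recovered almost exactly.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])
W, H = nmf_multiplicative(X, rank=1)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Compared with alternating least squares, the multiplicative form enforces nonnegativity for free but can converge slowly near zero entries.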
“…Different approaches have been unified under a generalized Bregman divergence theory [10]. Matrix factorization has been applied in domains involving time-series data, as in music transcription [11], up to EEG processing [12]. In comparison, we are going to use very fast variations of matrix factorization with very low dimensions, a fast learning rate, and early stopping.…”
Section: Matrix Factorization
confidence: 99%
“…Then the objective function is solved for w and w0, and the dual form is obtained as shown in Equation 10.…”
Section: Support Vector Machines
confidence: 99%
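The "Equation 10" referenced in that citing paper is not reproduced here; assuming the standard hard-margin SVM, the dual obtained by eliminating w and w0 from the Lagrangian takes the familiar form:

```latex
\max_{\alpha}\; \sum_{i} \alpha_i
  - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, x_i^\top x_j
\quad \text{s.t.} \quad
\alpha_i \ge 0, \qquad \sum_{i} \alpha_i y_i = 0,
```

where the stationarity conditions $w = \sum_i \alpha_i y_i x_i$ and $\sum_i \alpha_i y_i = 0$ are what remove the primal variables.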