2018
DOI: 10.48550/arxiv.1810.09401
Preprint

Alternating Linear Bandits for Online Matrix-Factorization Recommendation

Abstract: We consider the problem of collaborative filtering in the online setting, where items are recommended to users over time. At each time step, the user (selected by the environment) consumes an item (selected by the agent) and provides a rating of the selected item. In this paper, we propose a novel algorithm for online matrix-factorization recommendation that combines linear bandits and alternating least squares. In this formulation, the bandit feedback is equal to the difference between the ratings …
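The abstract describes interleaving per-user linear bandits with alternating-least-squares updates of the item factors. The paper's exact algorithm is not reproduced here; the following is a minimal hypothetical sketch of that style of method, with an assumed low-rank ground-truth rating matrix, LinUCB-style item selection over the current item factors, and a periodic ALS half-step that re-fits item factors from the logged ratings. All names and parameters (alpha, lam, the refit schedule) are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, rank, T = 5, 20, 2, 300
alpha, lam = 1.0, 1.0  # UCB width and ridge penalty (assumed values)

# Hypothetical ground-truth low-rank ratings (not the paper's setup).
U_true = rng.standard_normal((n_users, rank))
V_true = rng.standard_normal((n_items, rank))
R = U_true @ V_true.T

# Agent state: per-user ridge-regression statistics over the current
# item-factor features V, plus a log of (user, item, rating) triples.
V = rng.standard_normal((n_items, rank))       # item factors, re-fit by ALS
A = np.stack([lam * np.eye(rank)] * n_users)   # per-user Gram matrices
b = np.zeros((n_users, rank))
history = []

for t in range(T):
    u = rng.integers(n_users)                  # environment selects the user
    A_inv = np.linalg.inv(A[u])
    theta = A_inv @ b[u]                       # current user-factor estimate
    # LinUCB-style optimistic score for every item under features V.
    width = np.sqrt(np.einsum("ir,rs,is->i", V, A_inv, V))
    i = int(np.argmax(V @ theta + alpha * width))
    r = R[u, i] + 0.1 * rng.standard_normal()  # noisy rating feedback
    history.append((u, i, r))
    A[u] += np.outer(V[i], V[i])
    b[u] += r * V[i]
    if (t + 1) % 50 == 0:
        # ALS half-step: re-fit item factors from logged ratings while
        # holding the current user estimates fixed.
        users = np.array([h[0] for h in history])
        items = np.array([h[1] for h in history])
        ratings = np.array([h[2] for h in history])
        U_hat = np.stack([np.linalg.solve(A[k], b[k]) for k in range(n_users)])
        for j in range(n_items):
            mask = items == j
            if mask.any():
                X = U_hat[users[mask]]
                V[j] = np.linalg.solve(X.T @ X + lam * np.eye(rank),
                                       X.T @ ratings[mask])
        # Rebuild the per-user statistics under the new item features so
        # the bandit estimates stay consistent with V.
        A = np.stack([lam * np.eye(rank)] * n_users)
        b = np.zeros((n_users, rank))
        for (uu, ii, rr) in history:
            A[uu] += np.outer(V[ii], V[ii])
            b[uu] += rr * V[ii]
```

The alternation is the key design point: the bandit half treats the item factors as fixed features, and the ALS half treats the bandit's user estimates as fixed when re-fitting those features.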

Cited by 3 publications (8 citation statements)
References 4 publications
“…Note that the AM algorithm is a very strong baseline in practice for our problem setting. In [13], it was experimentally demonstrated for both synthetic and real datasets that the AM algorithm outperforms previously designed algorithms in the literature that can be applied to our problem setting ( [12] and [31]) by a significant margin.…”
Section: A. Experiments
confidence: 94%
“…In addition, their distribution-free regret bounds have sub-optimal dependence on T: T^(2/3) instead of the √T provided by our method. [13] provided an online alternating minimization heuristic for the general rank-r problem, but do not provide any regret bounds. In a separate line of work, [14,15,16,17,18,19,20] study a similar low-rank reward matrix setting, but they consider a significantly easier objective of identifying the largest entry in the entire reward matrix/tensor instead of finding the most rewarding arms for each user/agent.…”
Section: Other Related Work
confidence: 99%
“…Incorporating such information may lead to a better understanding of users' preferences (Θ), which may in turn improve the recommendation performance (Y). We focus on the graph-Laplacian-based regularizer, which has been widely adopted in the literature (Huang et al., 2018; Rao et al., 2015; Dadkhahi and Negahban, 2018; Yankelevsky and Elad, 2016) thanks to its mathematical regularity (e.g., convexity and differentiability). In particular, the least-squares estimation regularized by tr(Θ^T L Θ) is a convex program, and Θ can be computed by the well-known Bartels-Stewart algorithm (Bartels and Stewart, 1972) or more efficient algorithms developed recently (Rao et al., 2015; Ji et al., 2018).…”
Section: Introduction
confidence: 99%
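The quote above notes that least-squares estimation with a graph-Laplacian regularizer tr(Θ^T L Θ) is convex and reduces to a Sylvester matrix equation, classically solved by Bartels-Stewart. As a hypothetical illustration (the data, graph, and λ below are invented), consider fitting factors U for min_U ||Y − U Vᵀ||_F² + λ tr(Uᵀ L U) with V fixed: setting the gradient to zero gives λ L U + U (VᵀV) = Y V. This small sketch solves that Sylvester equation by Kronecker vectorization; at realistic sizes one would use Bartels-Stewart (e.g., scipy.linalg.solve_sylvester) instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank = 6, 8, 3
lam = 0.5  # assumed regularization strength

# Hypothetical ratings Y, fixed item factors V, and the Laplacian L of a
# chain graph over the 6 users (degree matrix minus adjacency).
Y = rng.standard_normal((n_users, n_items))
V = rng.standard_normal((n_items, rank))
L = (np.diag([1., 2, 2, 2, 2, 1])
     - np.diag(np.ones(5), 1) - np.diag(np.ones(5), -1))

# Zero-gradient condition of ||Y - U V^T||_F^2 + lam * tr(U^T L U):
#   lam * L @ U + U @ (V.T @ V) = Y @ V    -- a Sylvester equation A U + U B = C
A = lam * L        # (n_users x n_users)
B = V.T @ V        # (rank x rank)
C = Y @ V

# Kronecker vectorization: (I ⊗ A + B^T ⊗ I) vec(U) = vec(C), with
# column-stacking vec. Fine at this scale; Bartels-Stewart avoids the
# O((mr)^3) dense solve for large problems.
m, r = C.shape
K = np.kron(np.eye(r), A) + np.kron(B.T, np.eye(m))
U = np.linalg.solve(K, C.flatten(order="F")).reshape((m, r), order="F")

residual = np.linalg.norm(A @ U + U @ B - C)  # should be near machine precision
```

The equation has a unique solution here because λL is positive semidefinite and VᵀV is positive definite, so no eigenvalue of A is the negative of an eigenvalue of B.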
“…• We provide an analytic solution to the single-user estimation problem, which allows for the derivation of a tighter UCB, as well as a cumulative regret that scales linearly with a local smoothness measure of the user parameters. • We contribute broadly to the literature on graph-based data analysis, in particular signal processing and matrix factorization on graphs, by providing a theoretical analysis (which is largely absent in the literature) of the properties (e.g., convergence) of the graph-based estimator that frequently appears in these fields (Yankelevsky and Elad [2016], Dong et al. [2016], Nassif et al. [2018], Dadkhahi and Negahban [2018], Rao et al. [2015], Kalofolias et al. [2014]).…”
Section: Introduction
confidence: 99%