2018
DOI: 10.48550/arxiv.1810.09401
Preprint

Alternating Linear Bandits for Online Matrix-Factorization Recommendation

Abstract: We consider the problem of collaborative filtering in the online setting, where items are recommended to users over time. At each time step, the user (selected by the environment) consumes an item (selected by the agent) and provides a rating of the selected item. In this paper, we propose a novel algorithm for online matrix-factorization recommendation that combines linear bandits and alternating least squares. In this formulation, the bandit feedback is equal to the difference between the ratings …
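The abstract describes interleaving per-user linear bandits with alternating-least-squares updates of the item factors. The paper's exact algorithm is not reproduced here; the following is a minimal hypothetical sketch of that style of method, with an assumed low-rank ground-truth rating matrix, LinUCB-style item selection over the current item factors, and a periodic ALS half-step that re-fits item factors from the logged ratings. All names and parameters (alpha, lam, the refit schedule) are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, rank, T = 5, 20, 2, 300
alpha, lam = 1.0, 1.0  # UCB width and ridge penalty (assumed values)

# Hypothetical ground-truth low-rank ratings (not the paper's setup).
U_true = rng.standard_normal((n_users, rank))
V_true = rng.standard_normal((n_items, rank))
R = U_true @ V_true.T

# Agent state: per-user ridge-regression statistics over the current
# item-factor features V, plus a log of (user, item, rating) triples.
V = rng.standard_normal((n_items, rank))       # item factors, re-fit by ALS
A = np.stack([lam * np.eye(rank)] * n_users)   # per-user Gram matrices
b = np.zeros((n_users, rank))
history = []

for t in range(T):
    u = rng.integers(n_users)                  # environment selects the user
    A_inv = np.linalg.inv(A[u])
    theta = A_inv @ b[u]                       # current user-factor estimate
    # LinUCB-style optimistic score for every item under features V.
    width = np.sqrt(np.einsum("ir,rs,is->i", V, A_inv, V))
    i = int(np.argmax(V @ theta + alpha * width))
    r = R[u, i] + 0.1 * rng.standard_normal()  # noisy rating feedback
    history.append((u, i, r))
    A[u] += np.outer(V[i], V[i])
    b[u] += r * V[i]
    if (t + 1) % 50 == 0:
        # ALS half-step: re-fit item factors from logged ratings while
        # holding the current user estimates fixed.
        users = np.array([h[0] for h in history])
        items = np.array([h[1] for h in history])
        ratings = np.array([h[2] for h in history])
        U_hat = np.stack([np.linalg.solve(A[k], b[k]) for k in range(n_users)])
        for j in range(n_items):
            mask = items == j
            if mask.any():
                X = U_hat[users[mask]]
                V[j] = np.linalg.solve(X.T @ X + lam * np.eye(rank),
                                       X.T @ ratings[mask])
        # Rebuild the per-user statistics under the new item features so
        # the bandit estimates stay consistent with V.
        A = np.stack([lam * np.eye(rank)] * n_users)
        b = np.zeros((n_users, rank))
        for (uu, ii, rr) in history:
            A[uu] += np.outer(V[ii], V[ii])
            b[uu] += rr * V[ii]
```

The alternation is the key design point: the bandit half treats the item factors as fixed features, and the ALS half treats the bandit's user estimates as fixed when re-fitting those features.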

Cited by 3 publications (8 citation statements)
References 4 publications
“…Note that the AM algorithm is a very strong baseline in practice for our problem setting. In [13], it was experimentally demonstrated for both synthetic and real datasets that the AM algorithm outperforms previously designed algorithms in the literature that can be applied to our problem setting ( [12] and [31]) by a significant margin.…”
Section: A. Experiments
confidence: 94%
“…In addition, their distribution-free regret bounds have sub-optimal dependence on T: T^(2/3) instead of the √T provided by our method. [13] provided an online alternating minimization heuristic for the general rank-r problem, but do not provide any regret bounds. In a separate line of work, [14,15,16,17,18,19,20] study a similar low-rank reward matrix setting, but they consider a significantly easier objective of identifying the largest entry in the entire reward matrix/tensor instead of finding the most rewarding arms for each user/agent.…”
Section: Other Related Work
confidence: 99%
“…Incorporating such information may lead to a better understanding of users' preferences (Θ), which may in turn improve the recommendation performance (Y). We focus on the graph-Laplacian-based regularizer, which has been widely adopted in the literature (Huang et al., 2018; Rao et al., 2015; Dadkhahi and Negahban, 2018; Yankelevsky and Elad, 2016) thanks to its mathematical regularity (e.g., convexity and differentiability). In particular, the least-squares estimation regularized by tr(Θ^T L Θ) is a convex program, and Θ can be computed by the well-known Bartels-Stewart algorithm (Bartels and Stewart, 1972) or more efficient algorithms developed recently (Rao et al., 2015; Ji et al., 2018).…”
Section: Introduction
confidence: 99%
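The quote above notes that least-squares estimation with a graph-Laplacian regularizer tr(Θ^T L Θ) is convex and reduces to a Sylvester matrix equation, classically solved by Bartels-Stewart. As a hypothetical illustration (the data, graph, and λ below are invented), consider fitting factors U for min_U ||Y − U Vᵀ||_F² + λ tr(Uᵀ L U) with V fixed: setting the gradient to zero gives λ L U + U (VᵀV) = Y V. This small sketch solves that Sylvester equation by Kronecker vectorization; at realistic sizes one would use Bartels-Stewart (e.g., scipy.linalg.solve_sylvester) instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank = 6, 8, 3
lam = 0.5  # assumed regularization strength

# Hypothetical ratings Y, fixed item factors V, and the Laplacian L of a
# chain graph over the 6 users (degree matrix minus adjacency).
Y = rng.standard_normal((n_users, n_items))
V = rng.standard_normal((n_items, rank))
L = (np.diag([1., 2, 2, 2, 2, 1])
     - np.diag(np.ones(5), 1) - np.diag(np.ones(5), -1))

# Zero-gradient condition of ||Y - U V^T||_F^2 + lam * tr(U^T L U):
#   lam * L @ U + U @ (V.T @ V) = Y @ V    -- a Sylvester equation A U + U B = C
A = lam * L        # (n_users x n_users)
B = V.T @ V        # (rank x rank)
C = Y @ V

# Kronecker vectorization: (I ⊗ A + B^T ⊗ I) vec(U) = vec(C), with
# column-stacking vec. Fine at this scale; Bartels-Stewart avoids the
# O((mr)^3) dense solve for large problems.
m, r = C.shape
K = np.kron(np.eye(r), A) + np.kron(B.T, np.eye(m))
U = np.linalg.solve(K, C.flatten(order="F")).reshape((m, r), order="F")

residual = np.linalg.norm(A @ U + U @ B - C)  # should be near machine precision
```

The equation has a unique solution here because λL is positive semidefinite and VᵀV is positive definite, so no eigenvalue of A is the negative of an eigenvalue of B.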
“…• We provide an analytic solution to the single-user estimation problem, which allows for the derivation of a tighter UCB, as well as a cumulative regret that scales linearly with a local smoothness measure of the user parameters. • We contribute broadly to the literature on graph-based data analysis, in particular signal processing and matrix factorization on graphs, by providing a theoretical analysis (which is largely absent in the literature) of the properties (e.g., convergence) of the graph-based estimator that frequently appears in these fields (Yankelevsky and Elad [2016], Dong et al. [2016], Nassif et al. [2018], Dadkhahi and Negahban [2018], Rao et al. [2015], Kalofolias et al. [2014]).…”
Section: Introduction
confidence: 99%