2020
DOI: 10.48550/arxiv.2011.07931
Preprint

Do Offline Metrics Predict Online Performance in Recommender Systems?

Abstract: Recommender systems operate in an inherently dynamical setting. Past recommendations influence future behavior, including which data points are observed and how user preferences change. However, experimenting in production systems with real user dynamics is often infeasible, and existing simulation-based approaches have limited scale. As a result, many state-of-the-art algorithms are designed to solve supervised learning problems, and progress is judged only by offline metrics. In this work we investigate the e…

citations
Cited by 9 publications
(18 citation statements)
references
References 38 publications
“…Preference models We consider two preference models: one based on matrix factorization (MF) as well as a neighborhood based model (KNN). We use the LibFM SGD implementation [Rendle, 2012] for the MF model and use the item-based k-nearest neighbors model implemented by Krauth et al [2020]. For each dataset and recommender model we perform hyper-parameter tuning using a 10%-90% test-train split.…”
Section: Methods (mentioning)
confidence: 99%
“…Empirical studies of human behavior find mixed results on the relationship between recommendation and content diversity [Nguyen et al, 2014, Flaxman et al, 2016]. Simulation studies [Chaney et al, 2018, Yao et al, 2021, Krauth et al, 2020] and theoretical investigations [Dandekar et al, 2013] shed light on phenomena in simplified settings, showing how homogenization, popularity bias, performance, and polarization depend on assumed user behavior models. Even ensuring accuracy in sequential dynamic settings requires contending with closed-loop behaviors.…”
Section: Related Work (mentioning)
confidence: 99%
“…We generate the synthetic dataset using a modified version of the latent-static environment from the RecLab simulation platform [20].…”
Section: Empirical Setting and Methods (mentioning)
confidence: 99%
“…Given the usefulness of simulations, many simulation frameworks have been developed to study various fairness approaches for information retrieval systems; just to mention a few: MARS-Gym [139], ML-fairness-gym [41], Accordion [108], RecLab [90], RecSim NG [115], SIREN [23], T-RECS [105], RecoGym [137], AESim [59], Virtual-Taobao [146].…”
Section: Simulation and Applied Modeling To Study Long-term Effects A... (mentioning)
confidence: 99%