Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2020
DOI: 10.1145/3394486.3403351
A Sleeping, Recovering Bandit Algorithm for Optimizing Recurring Notifications

Abstract: Many online and mobile applications rely on daily emails and push notifications to increase and maintain user engagement. The multi-armed bandit approach provides a useful framework for optimizing the content of these notifications, but a number of complications (such as novelty effects and conditional eligibility) make conventional bandit algorithms unsuitable in practice. In this paper, we introduce the Recovering Difference Softmax Algorithm to address the particular challenges of this problem domain, and us…
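
The abstract stops short of the algorithm's details, but the ingredients it names (softmax action selection, recovery from novelty effects, and conditional eligibility) can be combined into a rough sketch. The recovery curve, eligibility masking, and temperature below are assumptions for illustration, not the paper's actual Recovering Difference Softmax Algorithm:

```python
# Illustrative sketch only: the abstract names the Recovering Difference
# Softmax Algorithm but does not spell out its update rules, so the recovery
# curve, eligibility mask, and temperature below are all assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_arms = 5
mean_reward = np.zeros(n_arms)        # running estimate of each notification's payoff
pull_count = np.zeros(n_arms)
last_used = np.full(n_arms, -np.inf)  # time step at which each arm was last shown

def recovery_factor(elapsed, tau=3.0):
    """Hypothetical recovery curve: an arm's effectiveness drops right after
    it is shown (novelty wears off) and recovers exponentially over time."""
    return 1.0 - np.exp(-elapsed / tau)

def choose_arm(t, eligible, temperature=0.1):
    """Softmax over recovery-adjusted reward estimates, restricted to
    currently eligible arms (ineligible, "sleeping" arms are masked out)."""
    adjusted = mean_reward * recovery_factor(t - last_used)
    logits = adjusted / temperature
    logits[~eligible] = -np.inf        # conditional eligibility
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(n_arms, p=probs)

def update(arm, reward, t):
    """Incremental mean update after observing the reward for the sent arm."""
    pull_count[arm] += 1
    mean_reward[arm] += (reward - mean_reward[arm]) / pull_count[arm]
    last_used[arm] = t
```

Here `eligible` would be a boolean mask over arms (e.g. `np.array([True, True, False, True, True])`), capturing that some notifications are only conditionally eligible to send on a given day.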

Cited by 4 publications (2 citation statements; citing publications from 2021 and 2023) · References 8 publications
“…Zhao et al [2018] propose a machine learning approach to decide notification volume for each user. Yancey and Settles [2020] propose a multi-armed bandit approach for notification optimization. Yue et al [2022] propose a ranking solution to decide which notification to send to users.…”
Section: Related Work
confidence: 99%
“…The authors compare their algorithm to a Bayesian d-step lookahead benchmark, which is the greedy algorithm optimizing the next d pulls given the decision maker's current situation. In comparison, our benchmark is concerned with the total reward over the whole time horizon T rather than a pre-fixed d. Other related work includes Mintz et al (2020) and Yancey & Settles (2020). In Mintz et al (2020), the recovery function is characterized via a parametric form, and the authors obtain a worst-case regret of O(T^{2/3}).…”
Section: Related Literature
confidence: 99%
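
The d-step lookahead benchmark quoted above admits a compact sketch: score every length-d pull sequence under a known reward model and play the first arm of the best one. The exponential recovery model and its parameters below are illustrative stand-ins, not the specification used in the cited papers:

```python
# Hedged sketch of a Bayesian d-step lookahead benchmark: exhaustively score
# every length-d pull sequence under an assumed reward model, then play the
# first arm of the best sequence. The reward model is an illustrative stand-in.
import math
from itertools import product

def expected_reward(arm, elapsed, base=(1.0, 0.8, 0.6), tau=3.0):
    """Assumed model: an arm's base value scaled by a recovery curve that
    grows with the time elapsed since the arm was last pulled."""
    return base[arm] * (1.0 - math.exp(-elapsed / tau))

def d_step_lookahead(last_used, t, n_arms=3, d=2):
    """Exhaustive greedy lookahead: O(n_arms ** d) sequences, fine for small d."""
    best_seq, best_val = None, float("-inf")
    for seq in product(range(n_arms), repeat=d):
        used = dict(last_used)   # copy so simulated pulls don't mutate state
        total = 0.0
        for step, arm in enumerate(seq):
            now = t + step
            total += expected_reward(arm, now - used.get(arm, float("-inf")))
            used[arm] = now
        if total > best_val:
            best_seq, best_val = seq, total
    return best_seq[0]

# Example: arm 1 was pulled recently (at t=9), so a 2-step plan starting at
# t=10 tends to prefer a fully recovered arm first.
print(d_step_lookahead({1: 9}, t=10))
```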