Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining 2023
DOI: 10.1145/3539597.3570486

Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning

Abstract: We study the budget allocation problem in online marketing campaigns that utilize previously collected offline data. We first discuss the long-term effect of optimizing marketing budget allocation decisions in the offline setting. To overcome the challenge, we propose a novel game-theoretic offline value-based reinforcement learning method using mixed policies. The proposed method reduces the need to store infinitely many policies in previous methods to only constantly many policies, which achieves nearly opti…

Citations: cited by 3 publications (2 citation statements)
References: 36 publications (72 reference statements)
“…"Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning" [3] T. Cai et al, "Marketing budget allocation with offline constrained deep reinforcement learning," This research paper explores using Reinforcement Learning (RL) for budget allocation in online marketing campaigns, especially for user acquisition and retention. It highlights the limitations of traditional methods relying on immediate user responses (e.g., coupon redemption).…”
Section: 17
Type: mentioning
confidence: 99%
“…It learns desired behaviors from interactions with an environment to maximize the long-term cumulative reward [1,32]. Compared to conventional methods such as collaborative filtering [22] and deep learning-based methods [10,18,43], RL is capable of handling partial feedback and optimizing long-term experience, hence it is promising in many real-world recommendation scenarios [4,5,39,42].…”
Section: Introduction
Type: mentioning
confidence: 99%