Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2018
DOI: 10.1145/3219819.3219886

Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning

Abstract: With the recent prevalence of Reinforcement Learning (RL), there has been tremendous interest in developing RL-based recommender systems. In practical recommendation sessions, users sequentially access multiple scenarios, such as the entrance pages and the item detail pages, and each scenario has its own recommendation strategy. However, the majority of existing RL-based recommender systems focus on optimizing one strategy for all scenarios or separately optimizing each strategy, which could lead to sub…

citations
Cited by 297 publications
(192 citation statements)
references
References 35 publications
“…In recent years, deep neural network models had a great impact on learning effective feature representations in various fields, such as speech recognition [12], Computer Vision (CV) [14] and Natural Language Processing (NLP) [4]. Some recent efforts have applied deep neural networks to recommendation tasks and shown promising results [41], but most of them used deep neural networks to model audio features of music [32], textual description of items [3,33], and visual content of images [40]. Besides, NeuMF [11] presented a Neural Collaborative Filtering framework to learn the non-linear interactions between users and items.…”
Section: Related Work (mentioning)
confidence: 99%
“…Equation (5);
end
# Updating the S-Network.
for j = 1 : K do
    Sample mini-batches of (s_t, i_t, r_t, s_{t+1}) from M;
    Set f, d, l, r according to r_t, s_{t+1};
    Update θ_s via mini-batch SGD w.r.t. the loss in Equation (7);
will cause the selection base.…”
Section: Simulator Learning (mentioning)
confidence: 99%
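
The loop quoted above (sample mini-batches of transitions from a memory M, then take a mini-batch SGD step) can be sketched roughly as follows. This is only an illustration under stated assumptions: the loss in Equation (7) is not reproduced in the excerpt, so a squared-error loss on a linear reward predictor stands in for it, the quantities f, d, l, r are omitted, and the names `sgd_update` and `train` are hypothetical.

```python
import random
from typing import List, Tuple

# A transition (s_t, i_t, r_t, s_{t+1}); states are plain feature lists here.
Transition = Tuple[List[float], int, float, List[float]]


def sgd_update(theta: List[float], batch: List[Transition], lr: float) -> List[float]:
    """One mini-batch SGD step on a stand-in loss (theta . s_t - r_t)^2.

    Assumption: this squared-error loss replaces the unseen Equation (7).
    """
    grad = [0.0] * len(theta)
    for s, _item, r, _s_next in batch:
        pred = sum(w * x for w, x in zip(theta, s))
        err = pred - r
        for k, x in enumerate(s):
            grad[k] += 2.0 * err * x / len(batch)
    return [w - lr * g for w, g in zip(theta, grad)]


def train(memory: List[Transition], theta: List[float], K: int,
          batch_size: int, lr: float, seed: int = 0) -> List[float]:
    """Repeat K times: sample a mini-batch from memory, then update theta."""
    rng = random.Random(seed)
    for _ in range(K):
        batch = rng.sample(memory, min(batch_size, len(memory)))
        theta = sgd_update(theta, batch, lr)
    return theta


# Toy memory in which the reward is exactly 2 * s_t[0].
memory = [([1.0], 0, 2.0, [1.0]), ([2.0], 1, 4.0, [2.0])]
theta = train(memory, theta=[0.0], K=200, batch_size=2, lr=0.05)
```

On this toy memory the learned weight approaches 2, the slope used to generate the rewards.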
“…Unfortunately, current methods including Monte Carlo (MC) and temporal-difference (TD) have limitations for offline policy learning in realistic recommender systems: MC-based methods suffer from the problem of high variance, especially when facing enormous action space (e.g., billions of candidate items) in real-world applications; TD-based methods improve the efficiency by using bootstrapping techniques in estimation, which, however, is confronted with another notorious problem called Deadly Triad (i.e., the problem of instability and divergence arises whenever combining function approximation, bootstrapping, and offline training [24]). Unfortunately, state-of-the-art methods [33,34] in recommender systems, which are designed with neural architectures, will inevitably encounter the Deadly Triad problem in offline policy learning.…”
Section: Introduction (mentioning)
confidence: 99%
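
The MC/TD distinction in the statement above can be made concrete with a small sketch. It assumes a toy three-step episode and a discount factor chosen for easy arithmetic; the function names `mc_return` and `td_target` are illustrative and do not come from the cited work.

```python
from typing import List


def mc_return(rewards: List[float], gamma: float) -> float:
    """Monte Carlo estimate: discounted sum of every reward until the episode ends."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def td_target(reward: float, next_value: float, gamma: float) -> float:
    """One-step TD target: one observed reward plus a bootstrapped value estimate."""
    return reward + gamma * next_value


rewards = [1.0, 0.0, 2.0]  # toy episode (assumed values)
gamma = 0.5

mc = mc_return(rewards, gamma)                           # uses the full trajectory
td = td_target(rewards[0], next_value=1.0, gamma=gamma)  # uses an estimate of the rest
```

The MC estimate depends on every sampled reward in the trajectory, which is where its variance comes from. The TD target replaces the tail of the trajectory with `next_value`, so its quality depends on how good that estimate is. Here `next_value` is set to the exact discounted return of the remaining steps, so the two agree.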
“…This is possible because (i) edge storage and compute resources are more powerful with various system-on-chip (SoC) technologies and (ii) there is a data-privacy practice to keep personal data locally. Further, due to its inherent capability of adaptive modeling and long-term planning, reinforcement learning presents potential in building interactive and personalized models, such as interactive recommendation systems [111], [112], [113].…”
Section: Model Training and Deployment (mentioning)
confidence: 99%