Proceedings of the 12th ACM Conference on Recommender Systems 2018
DOI: 10.1145/3240323.3240374
Deep reinforcement learning for page-wise recommendations

Abstract: Recommender systems can mitigate the information-overload problem by suggesting personalized items to users. In real-world recommendation settings such as e-commerce, a typical interaction between the system and its users is: users are recommended a page of items and provide feedback; the system then recommends a new page of items. To effectively capture such interactions for recommendation, we need to solve two key problems: (1) how to update the recommendation strategy according to users' real-time feedback, and (2) how …
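The interaction loop the abstract describes can be sketched minimally as follows. This is an illustrative toy, not the paper's method: the item catalog, page size, scoring table, and the naive click-based update rule are all hypothetical stand-ins for the learned policy the paper actually proposes.

```python
PAGE_SIZE = 4
CATALOG = list(range(20))           # toy item catalog (hypothetical)

def recommend_page(scores):
    """Pick the top-scoring items as the next page of recommendations."""
    return sorted(CATALOG, key=lambda i: scores[i], reverse=True)[:PAGE_SIZE]

def user_feedback(page, preferred):
    """Simulated user: clicks the recommended items in a hidden preferred set."""
    return [item for item in page if item in preferred]

scores = {i: 0.0 for i in CATALOG}  # stand-in for a learned scoring model
preferred = {2, 5, 11, 17}          # hidden user taste, for simulation only

for step in range(10):
    page = recommend_page(scores)           # system recommends a page
    clicks = user_feedback(page, preferred) # user provides feedback
    for item in page:                       # naive real-time update:
        scores[item] += 1.0 if item in clicks else -0.1
```

After a few pages of feedback, the recommended page converges to the simulated user's preferred items; the paper replaces this naive score update with a deep reinforcement-learning policy.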

Cited by 314 publications (217 citation statements). References 31 publications.
“…On the other hand, although Wu et al [28] proposed to optimize the delayed revisiting time, there is no systematic solution to optimizing delayed metrics for user engagement. Apart from contextual bandits, a series of MDP-based models [5,14,15,23,32,35,39] have been proposed for recommendation tasks. Arnold et al [5] proposed a modified DDPG model to deal with the problem of large discrete action spaces.…”
Section: Reinforcement Learning Based Recommender Systems
confidence: 99%
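The large-discrete-action-space trick this excerpt alludes to is commonly realized by having a DDPG-style actor emit a continuous "proto-action" in an item-embedding space and then snapping it to the nearest discrete items. A hedged sketch of that lookup step, with illustrative shapes and names (not the cited paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
# 1,000 candidate items embedded in an 8-dimensional space (toy numbers)
item_embeddings = rng.normal(size=(1000, 8))

def nearest_items(proto_action, k=5):
    """Map a continuous proto-action to its k nearest discrete items."""
    dists = np.linalg.norm(item_embeddings - proto_action, axis=1)
    return np.argsort(dists)[:k]    # indices of the k closest catalog items

proto = rng.normal(size=8)          # pretend this came from the actor network
candidates = nearest_items(proto, k=5)
```

In the full approach, a critic would then re-rank these k candidates; the nearest-neighbor step is what keeps the action space tractable when the catalog holds millions of items.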
“…Unfortunately, current methods, including Monte Carlo (MC) and temporal-difference (TD) learning, have limitations for offline policy learning in realistic recommender systems: MC-based methods suffer from high variance, especially when facing an enormous action space (e.g., billions of candidate items) in real-world applications; TD-based methods improve efficiency by using bootstrapping in estimation, which, however, is confronted with another notorious problem called the Deadly Triad (i.e., instability and divergence arise whenever function approximation, bootstrapping, and offline training are combined [24]). Unfortunately, state-of-the-art methods [33,34] in recommender systems, which are designed with neural architectures, will inevitably encounter the Deadly Triad problem in offline policy learning.…”
Section: Introduction
confidence: 99%
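The MC-versus-TD contrast this excerpt draws can be made concrete with the two tabular update rules themselves. A minimal sketch (standard textbook updates, not code from the cited papers; the learning rate and discount are illustrative):

```python
GAMMA = 0.9   # discount factor (illustrative)

def mc_update(value, full_return, alpha=0.1):
    """Monte Carlo: regress toward the complete sampled return.

    Unbiased, but high variance -- the full return sums noise over the
    whole trajectory, which worsens with enormous action spaces.
    """
    return value + alpha * (full_return - value)

def td0_update(value, reward, next_value, alpha=0.1):
    """TD(0): bootstrap from the current estimate of the next state's value.

    Lower variance, but bootstrapping is one leg of the Deadly Triad when
    combined with function approximation and offline training.
    """
    target = reward + GAMMA * next_value
    return value + alpha * (target - value)

v_mc = mc_update(0.0, full_return=2.0)                # moves toward raw return
v_td = td0_update(0.0, reward=1.0, next_value=0.5)    # moves toward r + gamma*V
```

The variance/bias trade-off lives entirely in the target: `full_return` is a noisy sample of the whole future, while `reward + GAMMA * next_value` substitutes the agent's own estimate for everything beyond one step.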
“…Matrix factorization based algorithms [27,28] are widely used to tackle recommendation problems. Recently, recommendation algorithms have achieved remarkable improvements with the help of deep learning models [12,14,30,47,49] and the successful introduction of side information [15,19,22,23,34]. In this study, we focus on introducing side information from the knowledge graph for recommendation; there are already two types of studies using knowledge graphs in recommendation: path-based and embedding-learning-based.…”
Section: Related Work
confidence: 99%
“…This is possible because (i) edge storage and compute resources have become more powerful with various system-on-chip (SoC) technologies and (ii) there is a data-privacy practice of keeping personal data local. Further, due to its inherent capability for adaptive modeling and long-term planning, reinforcement learning shows potential for building interactive and personalized models, such as interactive recommender systems [111], [112], [113].…”
Section: Model Training and Deployment
confidence: 99%