Proceedings of the 8th ACM Conference on Recommender Systems 2014
DOI: 10.1145/2645710.2645732
Ensemble contextual bandits for personalized recommendation

Abstract: The cold-start problem has attracted extensive attention among various online services that provide personalized recommendation. Many online vendors employ contextual bandit strategies to tackle the so-called exploration/exploitation dilemma rooted in the cold-start problem. However, due to high-dimensional user/item features and the underlying characteristics of bandit policies, it is often difficult for service providers to obtain and deploy an appropriate algorithm to achieve acceptable and robust economi…
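
As a concrete illustration of the exploration/exploitation trade-off described in the abstract, the sketch below implements a standard LinUCB-style contextual bandit (one linear model per item, with an upper-confidence exploration bonus on top of the point estimate). This is a generic illustration, not the ensemble algorithm proposed in this paper; the feature dimension, the exploration weight alpha, and the simulated click model are assumptions.

import numpy as np

# Generic LinUCB-style contextual bandit (disjoint linear models, one per item).
# Shown only to illustrate exploration vs. exploitation; NOT the paper's ensemble
# algorithm. d, alpha and the reward simulation are illustrative assumptions.
class LinUCBArm:
    def __init__(self, d, alpha=0.5):
        self.alpha = alpha
        self.A = np.eye(d)        # ridge-regression design matrix
        self.b = np.zeros(d)      # accumulated reward-weighted contexts

    def ucb(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                                    # coefficient estimate
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)    # mean + exploration bonus

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

def recommend(arms, context):
    # pick the item whose upper confidence bound is highest for this user context
    scores = [arm.ucb(context) for arm in arms]
    return int(np.argmax(scores))

# toy loop: 5 candidate items, 4-dimensional user/item features, simulated clicks
rng = np.random.default_rng(0)
d, n_items = 4, 5
arms = [LinUCBArm(d) for _ in range(n_items)]
hidden = rng.normal(size=(n_items, d))       # hidden preference model (assumption)
for t in range(1000):
    x = rng.normal(size=d)
    a = recommend(arms, x)
    click = float(rng.random() < 1.0 / (1.0 + np.exp(-hidden[a] @ x)))
    arms[a].update(x, click)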

Cited by 69 publications (47 citation statements); references 22 publications. Citing publications span 2015–2021.
“…Tang et al. [225] propose a context-aware recommender system, implemented as a contextual multi-armed bandit problem. Although the authors report extensive offline evaluation (log-based and simulation-based) with acceptable CTR, no comparison is made from a cold-start standpoint.…”
Section: Cold Start Problem
confidence: 99%
“…The resulting cumulative reward is considered "the unique good metric to evaluate the recommendation algorithm" [74]. It is common to extend this setting to a contextual bandit problem [117,37,192]. Another option is to run a separate bandit for each position in the recommended list [114]; this approach is known as ranked bandits.…”
Section: Multi-arm Bandits
confidence: 99%
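
The ranked-bandits approach mentioned in the statement above is commonly realized as one independent bandit per ranking slot, each learning which item performs best at its position. The sketch below uses a simple epsilon-greedy bandit per slot; the item count, list length, epsilon, and the simulated click feedback are illustrative assumptions, not taken from the cited papers.

import random

# One epsilon-greedy bandit per slot of a top-K list (ranked-bandits sketch).
class EpsilonGreedyBandit:
    def __init__(self, n_items, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_items
        self.values = [0.0] * n_items       # running mean reward per item

    def select(self, excluded):
        candidates = [i for i in range(len(self.counts)) if i not in excluded]
        if random.random() < self.epsilon:
            return random.choice(candidates)                    # explore
        return max(candidates, key=lambda i: self.values[i])    # exploit

    def update(self, item, reward):
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]

def build_ranking(bandits):
    # each slot's bandit picks an item not already placed higher in the list
    ranking, used = [], set()
    for bandit in bandits:
        item = bandit.select(used)
        ranking.append(item)
        used.add(item)
    return ranking

# toy run: 20 items, top-5 list; a simulated click rewards the bandit owning that slot
n_items, top_k = 20, 5
bandits = [EpsilonGreedyBandit(n_items) for _ in range(top_k)]
for t in range(1000):
    for slot, item in enumerate(build_ranking(bandits)):
        clicked = random.random() < 0.3 * (n_items - item) / n_items    # assumed click model
        bandits[slot].update(item, 1.0 if clicked else 0.0)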
“…This type of MAB is called multiple-play bandit and has been studied in [109,123]. Another emerging approach is based on ensemble learning, where the bandit algorithm decides which recommendation model to choose for filling the slots in the top-N ranking via exploring the potential of untested models and exploiting the predictive power of the already tested ones [192,62,31].…”
Section: Multi-arm Bandits
confidence: 99%
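
The ensemble idea described in the statement above can be sketched as a meta-bandit over candidate recommendation models: a bandit policy (here Thompson sampling on each model's observed click rate) decides which model serves the next top-N list, so untested models keep being sampled while consistently strong ones are exploited. The .recommend(user, n) model interface and the Beta(1, 1) prior are assumptions for illustration; the ensemble policies in the cited papers may differ.

import random

# Meta-bandit over recommendation models via Thompson sampling on click rates.
class ModelArm:
    def __init__(self, model):
        self.model = model       # assumed interface: model.recommend(user, n) -> list of items
        self.successes = 1       # Beta(1, 1) prior on the click-through rate
        self.failures = 1

    def sample_ctr(self):
        return random.betavariate(self.successes, self.failures)

    def update(self, clicked):
        if clicked:
            self.successes += 1
        else:
            self.failures += 1

def choose_model(arms):
    # Thompson sampling: pick the model whose sampled CTR is highest
    return max(arms, key=lambda arm: arm.sample_ctr())

def serve(arms, user, n=10):
    arm = choose_model(arms)
    return arm, arm.model.recommend(user, n)    # this model fills the top-N slots

On each impression, serve() selects one candidate model and returns its top-N list; the observed click/no-click outcome is then fed back with arm.update(clicked), which keeps exploration alive for rarely chosen models.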