Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval 2016
DOI: 10.1145/2911451.2914798

Online Learning to Rank for Information Retrieval

Abstract: Over the past 10-15 years, offline learning to rank has had a tremendous influence on information retrieval, both scientifically and in practice. Recently, as the limitations of offline learning to rank have become apparent, online learning to rank methods have received increasing attention in the information retrieval community. Such methods learn from user interactions rather than from a set of labeled data that is fully available for training up front. Below we describe why we be…

Cited by 45 publications (27 citation statements) · References 34 publications (29 reference statements)

“…This problem is usually formalized as a multi-armed bandit problem [63] or a contextual bandit problem [53]. Both views are extensively covered in recent tutorials [32,33,61], which discuss such problems as dueling bandit gradient descent [77] and the exploration vs. exploitation trade-off [38]. Lattimore and Szepesvári [52,Chapter 32] present a theoretical framework for using bandit algorithms for IR, and highlight unique challenges and ways to address them in the online setting.…”
Section: Scoring and Ranking
confidence: 99%
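
The exploration vs. exploitation trade-off discussed in this statement can be made concrete with a small sketch. Below is a minimal epsilon-greedy multi-armed bandit in Python, with rankers as arms and click feedback as reward; the arm count, epsilon value, and binary click reward are illustrative assumptions, not details from the cited works:

import random

class EpsilonGreedyRankerBandit:
    def __init__(self, n_rankers, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_rankers    # times each ranker was shown
        self.values = [0.0] * n_rankers  # running mean click reward per ranker

    def select(self):
        # Explore a random ranker with probability epsilon, else exploit the best.
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental running-mean update for the chosen ranker.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

bandit = EpsilonGreedyRankerBandit(n_rankers=3)
arm = bandit.select()
bandit.update(arm, reward=1.0)  # e.g., 1.0 if the shown ranking received a click

Dueling bandit methods such as DBGD replace the per-arm reward estimate with pairwise comparisons between rankers, but the same trade-off between trying new rankers and serving the current best drives both families.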
“…
Ai et al [3]: Unbiased learning to rank
Arguello [5]: Aggregated search
Barocas and Hardt [8]: Fairness in machine learning
Bast et al [9]: Semantic search, knowledge graphs
Budylin et al [12,13]: Online evaluation
Burges [14]: Learning to rank
Cai and de Rijke [16]: Query auto-completion
Cambazoglu and Baeza-Yates [17]: Infrastructure
Chuklin et al [21,22,23,24]: Click models
Crestani et al [26]: Mobile information retrieval
Gao et al [30]: Conversational search
Glowacka [32]: Bandit algorithms
Grotov and de Rijke [33]: Online learning to rank
Hajian et al [35]: Algorithmic bias
Hofmann et al [40]: Online evaluation
Hui Yang and Zhang [41]: Differential privacy in information retrieval
Joachims and Swaminathan [44]: Counterfactual evaluation and learning
Jones [45]: Mobile search
Kanoulas [46]: Online and offline evaluation
Kelly [47]: User studies
Kenter et al [48]: Neural methods in information retrieval
Knijnenburg and Berkovsky [49]: Privacy in recommender systems
Lalmas [51]: XML retrieval
Lattimore and Szepesvári [52]: Bandit algorithms
Liu [55]: Offline learning to rank
Mehrotra et al [57]: Task understanding
Mitra and Craswell [58]: Neural methods in information retrieval
Onal et al [60]: Neural methods in information retrieval
Oosterhuis [61]: Online evaluation and ranking
Ren et al [66]: E-commerce
Sakai [67]: Experimental design and methodology
Santos et al [68]: Diversification
Silvestri…”
Section: Author(s) Topic
confidence: 99%
“…It is difficult to evaluate the effectiveness of online and reinforcement learning algorithms for information systems in a live setting with real users because it requires a very long time and a large amount of resources [30,31,51,58,63]. Thus, most studies in this area use purely simulated user interactions [31,51,58].…”
Section: Poisson
confidence: 99%
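
The simulated user interactions mentioned in this statement typically come from a click model. Below is a minimal sketch of a cascade-style click simulator in Python; the relevance probabilities and stop probability are hypothetical values for illustration, not parameters from the cited studies:

import random

def simulate_clicks(ranking, relevance, stop_prob=0.5):
    # Scan the ranking top-down; click a document with probability equal to
    # its (assumed) relevance, then stop examining with probability stop_prob.
    clicks = []
    for rank, doc in enumerate(ranking):
        if random.random() < relevance[doc]:
            clicks.append(rank)
            if random.random() < stop_prob:
                break
    return clicks

relevance = {"d1": 0.9, "d2": 0.2, "d3": 0.6}  # hypothetical relevance probabilities
print(simulate_clicks(["d2", "d1", "d3"], relevance))

Running an online learner against such a simulator stands in for the live users that would otherwise make these experiments slow and expensive.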
“…For example, they assume that the user picks queries to express an intent according to a fixed probability distribution. It is known that learning methods that work well in a static setting do not deliver the desired outcomes in a setting where all agents may modify their strategies [18,30]. Hence, current techniques may not help the DBMS understand the user's information need over a long-term interaction.…”
Section: Introduction
confidence: 99%
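
The fixed-distribution assumption criticized in this statement can be illustrated with a tiny sketch: a simulated user draws queries for an intent from a static categorical distribution that never adapts to the system's behavior. The intent name, query strings, and weights below are hypothetical:

import random

INTENT_QUERIES = {
    # Hypothetical intent with a static distribution over query phrasings.
    "find_flight": (["cheap flights", "flight deals", "airfare"], [0.5, 0.3, 0.2]),
}

def sample_query(intent):
    # The distribution is fixed: it never adapts to the system's responses,
    # which is exactly the assumption the statement above criticizes.
    queries, weights = INTENT_QUERIES[intent]
    return random.choices(queries, weights=weights, k=1)[0]

print(sample_query("find_flight"))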
“…This property enables the experience replay update used in DQN. Third, we propose to apply a Dueling Bandit Gradient Descent (DBGD) method [16,17,49] for exploration, by choosing random item candidates in the neighborhood of the current recommender. This exploration strategy can avoid recommending totally unrelated items and hence maintain better recommendation accuracy.…”
Section: Introduction
confidence: 99%
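
The DBGD-style exploration described in this statement perturbs the current model to obtain a nearby candidate and keeps the direction only if the candidate wins a comparison. Below is a minimal sketch assuming a linear scoring model; the step sizes delta and gamma and the duel() comparison oracle are assumptions, not the authors' implementation:

import numpy as np

def dbgd_step(w, duel, delta=1.0, gamma=0.1):
    # Sample a random direction on the unit sphere and propose a nearby
    # candidate model; if it wins the duel (e.g., an interleaved comparison
    # judged by clicks), take a small step toward it.
    u = np.random.randn(len(w))
    u /= np.linalg.norm(u)
    candidate = w + delta * u
    if duel(candidate, w):  # True when the candidate beats the current model
        w = w + gamma * u
    return w

w = np.zeros(5)                                       # current linear model weights
noisy_duel = lambda cand, cur: np.random.rand() < 0.5  # placeholder duel oracle
w = dbgd_step(w, noisy_duel)

Because each candidate stays within a delta-ball of the current weights, the exploratory recommendations remain close to the current recommender, which is what keeps this strategy from surfacing totally unrelated items.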