The Learning to Rank (LTR) technique is ubiquitous in modern Information Retrieval systems, especially in search ranking applications. The query-item relevance labels typically used to train the ranking model are often noisy measurements of human behavior, e.g., product ratings in product search. Such coarse measurements make the ground-truth ranking non-unique with respect to a single relevance criterion. To resolve this ambiguity, it is desirable to train a model using multiple relevance criteria, giving rise to Multi-Label LTR (MLLTR). Moreover, MLLTR formulates multiple goals that may be conflicting yet are all important to optimize simultaneously; e.g., in product search, a ranking model can be trained on both product quality and purchase likelihood to increase revenue. In this research, we leverage the Multi-Objective Optimization (MOO) aspect of the MLLTR problem and employ recently developed MOO algorithms to solve it. Specifically, we propose a general framework in which the information from the labels can be combined in a variety of ways to meaningfully characterize the trade-off among the goals. Our framework allows any gradient-based MOO algorithm to be used to solve the MLLTR problem. We evaluate the proposed framework on two publicly available LTR datasets and one e-commerce dataset to demonstrate its efficacy.
Introduction

Research in Learning to Rank (LTR) has exploded in the last decade. This growth can be attributed to the increasing availability of labeled query-item relevance data, obtained either through manual labeling or by tracking user behavior. In LTR, a scoring function is trained to score the retrieved items for ranking. Originally, LTR was developed to use only one relevance criterion for training. However, owing to the limitations of such a uni-dimensional approach, e.g., subjectivity and noise in relevance articulation and the inability to incorporate multiple goals, Multi-Label Learning to Rank (MLLTR) [1] adopts a multi-dimensional notion of relevance. The multi-dimensional aspect of MLLTR poses a fundamental challenge: different relevance criteria can conflict. For example, in web search, tailoring results to a user's history can conflict with surfacing serendipitous items in the top results. Due to this conflict, it is virtually infeasible to find a scoring function that simultaneously optimizes all relevance criteria, thus requiring a trade-off among them.
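To make this trade-off concrete, the sketch below is a minimal illustration, not the framework proposed in this paper: a scoring function is trained against two relevance labels by linearly combining per-label pairwise ranking losses with a fixed weight (the simplest scalarization of a multi-objective problem). The model, loss, feature sizes, and weight are all hypothetical placeholders; gradient-based MOO algorithms can be thought of as replacing the fixed weight with an adaptive combination of the per-label gradients.

import torch
import torch.nn as nn

# Hypothetical scoring function: maps item features to a single score.
score_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

def pairwise_loss(scores, labels):
    """Logistic pairwise ranking loss over all item pairs in one query."""
    diff = scores.unsqueeze(0) - scores.unsqueeze(1)            # diff[i, j] = s_j - s_i
    pref = (labels.unsqueeze(0) > labels.unsqueeze(1)).float()  # 1 if item j preferred to i
    return (pref * torch.log1p(torch.exp(-diff))).sum() / pref.sum().clamp(min=1)

features = torch.randn(10, 16)         # 10 retrieved items, 16 features each (toy data)
quality = torch.randint(0, 5, (10,))   # relevance label 1, e.g., product quality
purchase = torch.randint(0, 5, (10,))  # relevance label 2, e.g., purchase likelihood

scores = score_net(features).squeeze(-1)
w = 0.7  # fixed trade-off weight; gradient-based MOO methods adapt this per step
loss = w * pairwise_loss(scores, quality) + (1 - w) * pairwise_loss(scores, purchase)
loss.backward()  # one gradient step trades off the two (possibly conflicting) criteria

With a fixed weight, improving the ranking under one label can degrade it under the other; sweeping w traces out different points on the trade-off curve, which is exactly the behavior that MOO algorithms characterize more systematically.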