Martin Bompaire scite author profile

Martin Bompaire

5 Publications

19 Citation Statements Received

99 Citation Statements Given

Affiliations

Criteo (France)

Publications

Joint Policy-Value Learning for Recommendation

Jeunen, Rohde, Vasile et al. 2020

Conventional approaches to recommendation often do not explicitly take into account information on previously shown recommendations and their recorded responses. One reason is that, since we do not know the outcome of actions the system did not take, learning directly from such logs is not a straightforward task. Several methods for off-policy or counterfactual learning have been proposed in recent years, but their efficacy for the recommendation task remains understudied. Due to the limitations of offline datasets and the lack of access of most academic researchers to online experiments, this is a non-trivial task. Simulation environments can provide a reproducible solution to this problem. In this work, we conduct the first broad empirical study of counterfactual learning methods for recommendation, in a simulated environment. We consider various policy-based methods that make use of the Inverse Propensity Score (IPS) to perform Counterfactual Risk Minimisation (CRM), as well as value-based methods based on Maximum Likelihood Estimation (MLE). We highlight how existing off-policy learning methods fail due to stochastic and sparse rewards, and show how a logarithmic variant of the traditional IPS estimator can solve these issues, whilst convexifying the objective and thus facilitating its optimisation. Additionally, under certain assumptions the value- and policy-based methods have an identical parameterisation, allowing us to propose a new model that combines both the MLE and CRM objectives. Extensive experiments show that this "Dual Bandit" approach achieves state-of-the-art performance in a wide range of scenarios, for varying logging policies, action spaces and training sample sizes.
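As a rough illustration of the estimators the abstract mentions, the sketch below computes a vanilla IPS estimate from logged data, alongside one plausible reading of a logarithmic variant in which the target policy's probability is replaced by its logarithm. The abstract does not state either formula, so both function names, the exact form of the logarithmic variant, and the sample numbers are assumptions made for illustration only.

```python
import math

def ips_estimate(rewards, target_probs, logging_probs):
    """Vanilla IPS: average of reward * pi(a|x) / pi_0(a|x) over logged samples."""
    terms = [r * t / l for r, t, l in zip(rewards, target_probs, logging_probs)]
    return sum(terms) / len(terms)

def log_ips_estimate(rewards, target_probs, logging_probs):
    """Assumed logarithmic variant: reward * log(pi(a|x)) / pi_0(a|x).

    This form is an assumption; the abstract only says a logarithmic
    variant of IPS is used and that it makes the objective convex.
    """
    terms = [r * math.log(t) / l for r, t, l in zip(rewards, target_probs, logging_probs)]
    return sum(terms) / len(terms)

# Hypothetical logged data: three impressions, two of which were clicked.
rewards = [1, 0, 1]
target_probs = [0.5, 0.2, 0.25]   # pi(a|x) under the policy being evaluated
logging_probs = [0.25, 0.5, 0.5]  # pi_0(a|x) under the logging policy

ips_value = ips_estimate(rewards, target_probs, logging_probs)
log_ips_value = log_ips_estimate(rewards, target_probs, logging_probs)
```

Samples with zero reward contribute nothing to either sum, which hints at why sparse rewards make these objectives hard to learn from.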

Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption

Bompaire, Bacry, Gaïffas 2018

Preprint

The minimization of convex objectives coming from linear supervised learning problems, such as penalized generalized linear models, can be formulated as finite sums of convex functions. For such problems, a large set of stochastic first-order solvers based on the idea of variance reduction are available and combine both computational efficiency and sound theoretical guarantees (linear convergence rates) [19], [35], [36], [13]. Such rates are obtained under both gradient-Lipschitz and strong convexity assumptions. Motivated by learning problems that do not meet the gradient-Lipschitz assumption, such as linear Poisson regression, we work under another smoothness assumption, and obtain a linear convergence rate for a shifted version of Stochastic Dual Coordinate Ascent (SDCA) [36] that improves the current state-of-the-art. Our motivation for considering a solver working on the Fenchel-dual problem comes from the fact that such objectives include many linear constraints, which are easier to deal with in the dual. Our approach and theoretical findings are validated on several datasets, for Poisson regression and another objective coming from the negative log-likelihood of the Hawkes process, a family of models that proves extremely useful for the modeling of information propagation in social networks and causality inference [12], [14].

Causal Models for Real Time Bidding with Repeated User Interactions

Bompaire, Gilotte, Heymann 2021

A large portion of online advertising displays are sold through an auction mechanism called Real Time Bidding (RTB). Each auction corresponds to a display opportunity, for which the competing advertisers need to precisely estimate the economic value in order to bid accordingly. This estimate is typically taken as the advertiser's payoff for the target event (such as a purchase on the merchant website attributed to this display) times the event's estimated probability. However, this greedy approach is too naive when several displays are shown to the same user. The purpose of the present paper is to discuss how such an estimation should be made when a user has already been shown one or more displays. Intuitively, while a user is more likely to make a purchase if the number of displays increases, the marginal effect of each display is expected to be decreasing. In this work, we first frame this bidding problem with repeated user interactions by using causal models to value each display individually. Then, based on this approach, we introduce a simple rule to improve the value estimate. This change shows both interesting qualitative properties that follow our previous intuition as well as quantitative improvements on a public data set and online in a production environment.
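The contrast between the greedy estimate and the diminishing marginal effect described in the abstract can be sketched as follows. This is not the rule proposed in the paper, which the abstract does not specify; the function names, the `baseline_prob` parameter, and all numbers are hypothetical and chosen only to illustrate the intuition.

```python
def greedy_display_value(payoff, event_probability):
    """Greedy estimate from the abstract: payoff times the event's estimated probability."""
    return payoff * event_probability

def marginal_display_values(payoff, cumulative_purchase_probs, baseline_prob=0.0):
    """Value each display by the increase in purchase probability it brings.

    cumulative_purchase_probs[k] is a hypothetical purchase probability after
    k + 1 displays; baseline_prob is the assumed probability with no display.
    """
    values = []
    previous = baseline_prob
    for prob in cumulative_purchase_probs:
        values.append(payoff * (prob - previous))
        previous = prob
    return values

# Hypothetical numbers: a purchase worth 10 units, and purchase probabilities
# that increase with each display but with shrinking increments.
payoff = 10.0
cumulative_probs = [0.02, 0.03, 0.035]

greedy_values = [greedy_display_value(payoff, p) for p in cumulative_probs]
marginal_values = marginal_display_values(payoff, cumulative_probs)
```

With these numbers the greedy values grow with every display, while the marginal values shrink, which is the qualitative behaviour the abstract's intuition calls for.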

Offline Evaluation of Reward-Optimizing Recommender Systems: The Case of Simulation

Aouali, Benhalloum, Bompaire et al. 2022

Preprint

Robust label attribution for real-time bidding

Bompaire, Désir, Heymann 2020

Preprint
