Sponsored search auctions constitute one of the most successful applications of microeconomic mechanisms. In mechanism design, auctions are usually designed to incentivize advertisers to bid their truthful valuations and, at the same time, to assure both the advertisers and the auctioneer a non-negative utility. Nonetheless, in sponsored search auctions, the click-through-rates (CTRs) of the advertisers are often unknown to the auctioneer and thus standard incentive compatible mechanisms cannot be directly applied and must be paired with an effective learning algorithm for the estimation of the CTRs. This introduces the critical problem of designing a learning mechanism able to estimate the CTRs as the same time as implementing a truthful mechanism with a revenue loss as small as possible compared to an optimal mechanism designed with the true CTRs. Previous works showed that in single-slot auctions the problem can be solved using a suitable exploration-exploitation mechanism able to achieve a per-step regret of order O(T −1/3 ) (where T is the number of times the auction is repeated). In this paper we extend these results to the general case of contextual multi-slot auctions with position-and ad-dependent externalities. In particular, we prove novel upper-bounds on the revenue loss w.r.t. to a VCG auction and we report numerical simulations investigating their accuracy in predicting the dependency of the regret on the number of rounds T , the number of slots K, and the number of advertisements n.
Multi-Armed Bandit (MAB) techniques have been successfully applied to many classes of sequential decision problems in the past decades. However, non-stationary settings -- very common in real-world applications -- received little attention so far, and theoretical guarantees on the regret are known only for some frequentist algorithms. In this paper, we propose an algorithm, namely Sliding-Window Thompson Sampling (SW-TS), for nonstationary stochastic MAB settings. Our algorithm is based on Thompson Sampling and exploits a sliding-window approach to tackle, in a unified fashion, two different forms of non-stationarity studied separately so far: abruptly changing and smoothly changing. In the former, the reward distributions are constant during sequences of rounds, and their change may be arbitrary and happen at unknown rounds, while, in the latter, the reward distributions smoothly evolve over rounds according to unknown dynamics. Under mild assumptions, we provide regret upper bounds on the dynamic pseudo-regret of SW-TS for the abruptly changing environment, for the smoothly changing one, and for the setting in which both the non-stationarity forms are present. Furthermore, we empirically show that SW-TS dramatically outperforms state-of-the-art algorithms even when the forms of non-stationarity are taken separately, as previously studied in the literature.
Sponsored Search Auctions (SSAs) constitute one of the most successful applications of microeconomic mechanisms. In mechanism design, auctions are usually designed to incentivize advertisers to bid their truthful valuations and, at the same time, to guarantee both the advertisers and the auctioneer a non-negative utility. Nonetheless, in sponsored search auctions, the Click-Through-Rates (CTRs) of the advertisers are often unknown to the auctioneer and thus standard truthful mechanisms cannot be directly applied and must be paired with an effective learning algorithm for the estimation of the CTRs. This introduces the critical problem of designing a learning mechanism able to estimate the CTRs at the same time as implementing a truthful mechanism with a revenue loss as small as possible compared to the mechanism that can exploit the true CTRs. Previous work showed that, when dominant-strategy truthfulness is adopted, in single-slot auctions the problem can be solved using suitable exploration-exploitation mechanisms able to achieve a cumulative regret (on the auctioneer's revenue) of order O (T 2 3 ), where T is the number of times the auction is repeated. It is also known that, when truthfulness in expectation is adopted, a cumulative regret (over the social welfare) of order O (T 1 2 ) can be obtained. In this paper we extend the results available in the literature to the more realistic case of multi-slot auctions. In this case, a model of the user is needed to characterize how the CTR of an ad changes as its position in the allocation changes. In particular, we adopt the cascade model, one of the most popular models for sponsored search auctions, and we prove a number of novel upper bounds and lower bounds on both auctioneer's revenue loss and social welfare w.r.t. the Vickrey-Clarke-Groves (VCG) auction. Furthermore, we report numerical simulations investigating the accuracy of the bounds in predicting the dependency of the regret on the auction parameters.
(1) Background: In advanced non-small cell lung cancer (aNSCLC), programmed death ligand 1 (PD-L1) remains the only biomarker for candidate patients to immunotherapy (IO). This study aimed at using artificial intelligence (AI) and machine learning (ML) tools to improve response and efficacy predictions in aNSCLC patients treated with IO. (2) Methods: Real world data and the blood microRNA signature classifier (MSC) were used. Patients were divided into responders (R) and non-responders (NR) to determine if the overall survival of the patients was likely to be shorter or longer than 24 months from baseline IO. (3) Results: One-hundred sixty-four out of 200 patients (i.e., only those ones with PD-L1 data available) were considered in the model, 73 (44.5%) were R and 91 (55.5%) NR. Overall, the best model was the linear regression (RL) and included 5 features. The model predicting R/NR of patients achieved accuracy ACC = 0.756, F1 score F1 = 0.722, and area under the ROC curve AUC = 0.82. LR was also the best-performing model in predicting patients with long survival (24 months OS), achieving ACC = 0.839, F1 = 0.908, and AUC = 0.87. (4) Conclusions: The results suggest that the integration of multifactorial data provided by ML techniques is a useful tool to select NSCLC patients as candidates for IO.
Pay-per-click advertising includes various formats (e.g., search, contextual, and social) with a total investment of more than 140 billion USD per year. An advertising campaign is composed of some subcampaigns-each with a different ad-and a cumulative daily budget. The allocation of the ads is ruled exploiting auction mechanisms. In this paper, we propose, for the first time to the best of our knowledge, an algorithm for the online joint bid/budget optimization of pay-per-click multi-channel advertising campaigns. We formulate the optimization problem as a combinatorial bandit problem, in which we use Gaussian Processes to estimate stochastic functions, Bayesian bandit techniques to address the exploration/exploitation problem, and a dynamic programming technique to solve a variation of the Multiple-Choice Knapsack problem. We experimentally evaluate our algorithm both in simulation-using a synthetic setting generated from real data from Yahoo!-and in a real-world application over an advertising period of two months.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.