Proceedings of the 2017 ACM Conference on Economics and Computation (EC 2017)
DOI: 10.1145/3033274.3085145
Online Auctions and Multi-scale Online Learning

Abstract: We consider revenue maximization in online auctions and pricing. A seller sells an identical item in each period to a new buyer, or a new set of buyers. For the online posted pricing problem, we show regret bounds that scale with the best fixed price, rather than the range of the values. We also show regret bounds that are almost scale free, and match the offline sample complexity, when comparing to a benchmark that requires a lower bound on the market share. These results are obtained by generalizing the clas…
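To make the posted-pricing setting concrete, the following toy sketch (not the paper's algorithm) treats each candidate price on a grid as an expert whose per-round revenue lies in [0, p], which is exactly the multi-scale structure the paper exploits. The price grid, the single learning rate, and the uniform buyer-value model are assumptions made only for illustration; the paper's algorithms instead use scale-dependent learning rates to obtain regret that scales with the best fixed price rather than with the range of values.

```python
# A minimal illustrative sketch, NOT the paper's algorithm: online posted
# pricing over a fixed price grid, treated as a learning-with-experts problem.
# Each candidate price p is an "expert" whose per-round revenue lies in [0, p].
# The uniform buyer values, the price grid, and the single learning rate are
# assumptions made only for illustration.
import numpy as np

rng = np.random.default_rng(0)
prices = np.linspace(0.05, 1.0, 20)      # hypothetical grid of posted prices
eta = 0.5                                # single global learning rate (toy choice)
log_w = np.zeros(len(prices))            # log-weights for exponential weights
cum_rev = np.zeros(len(prices))          # revenue each fixed price would earn
earned = 0.0

for t in range(5000):
    probs = np.exp(log_w - log_w.max())
    probs /= probs.sum()
    i = rng.choice(len(prices), p=probs)     # post a randomly drawn price
    value = rng.uniform()                    # buyer's private value (toy model)
    revenues = prices * (value >= prices)    # full-information revenue vector
    earned += revenues[i]
    cum_rev += revenues
    log_w += eta * revenues                  # reward-based exponential update

print(f"learner: {earned:.1f}   best fixed price in hindsight: {cum_rev.max():.1f}")
```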

Cited by 24 publications (45 citation statements). References 23 publications.
“…We present our result in a more general setting where the learner receives a predicted loss vector $m_t$ before deciding $w_t$ (Rakhlin and Sridharan, 2013b), and show a bound $\mathrm{REG}(e_i) = \tilde{O}\big(\sqrt{(\ln d)\sum_{t=1}^{T}(\ell_{t,i} - m_{t,i})^2}\big)$ simultaneously for all $i$ (setting $m_t = 0$ resolves the original impossible tuning issue). Using different $m_t$, we achieve various regret bounds summarized in Table 1, which either recover the guarantees of existing algorithms such as (A, B)-PROD (Sani et al., 2014), ADAPT-ML-PROD (Gaillard et al., 2014), and OPTIMISTIC-ADAPT-ML-PROD (Wei et al., 2016), or improve over existing variance/path-length/multi-scale bounds in (Steinhardt and Liang, 2014; Bubeck et al., 2017; Foster et al., 2017; Cutkosky and Orabona, 2018). Notably, we achieve a bound $\tilde{O}\big(\sqrt{(\ln d)\sum_{t=1}^{T}\langle w_t - e_i,\, \ell_t - \ell_{t-1}\rangle^2}\big)$, which simultaneously ensures the "fast rate" consequences discussed in for stochastic settings and the path-length bound useful for fast convergence in games (Syrgkanis et al., 2015).…”
Section: Notes (mentioning)
confidence: 62%
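The first bound above is an optimistic (prediction-aware) guarantee. As a rough illustration of the mechanism, the sketch below runs optimistic exponential weights: the prediction $m_t$ is folded into the weights before $w_t$ is played, and choosing $m_t = \ell_{t-1}$ gives a path-length flavor. This is only a sketch of the prediction mechanism, not the cited algorithm, which additionally achieves the adaptive bound simultaneously for every expert; the fixed learning rate and toy losses are assumptions.

```python
# Minimal sketch of optimistic exponential weights (illustration only): the
# learner sees a predicted loss vector m_t before committing to w_t.  The
# fixed learning rate eta and the toy loss sequence are assumptions; the cited
# algorithm instead tunes a separate rate per expert to get the quoted bound.
import numpy as np

def optimistic_hedge(losses, predictions, eta=0.1):
    """losses, predictions: (T, d) arrays. Returns the weights played each round."""
    T, d = losses.shape
    cum = np.zeros(d)                      # cumulative observed losses
    played = np.zeros((T, d))
    for t in range(T):
        z = -eta * (cum + predictions[t])  # incorporate the prediction m_t
        w = np.exp(z - z.max())
        played[t] = w / w.sum()
        cum += losses[t]                   # then the true loss ell_t is revealed
    return played

# Toy usage with the path-length style prediction m_t = ell_{t-1}.
rng = np.random.default_rng(1)
losses = rng.uniform(size=(1000, 5))
preds = np.vstack([np.zeros((1, 5)), losses[:-1]])
W = optimistic_hedge(losses, preds)
regret = (W * losses).sum() - losses.sum(axis=0).min()
print(f"regret to best expert: {regret:.2f}")
```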
“…Our first main contribution is to show that, perhaps surprisingly, this impossible tuning is in fact possible (up to an additional $\ln T$ factor), via an algorithm combining ideas that have mostly appeared before. More concretely, we achieve this via Mirror Descent with a correction term similar to that of Steinhardt and Liang (2014) and a weighted negative-entropy regularizer with a different learning rate for each expert (and each round), similar to Bubeck et al. (2017). Note that while natural, this algorithm has not been studied before, and it is not equivalent to using different learning rates for different experts in PROD or multiplicative weights, as it does not admit a closed "proportional" form (and instead needs to be computed via a line search).…”
Section: Notes (mentioning)
confidence: 99%
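The update described in this quote can be sketched concretely. The code below is a simplified rendering (my assumptions, not the exact cited algorithm): mirror descent over the simplex with the weighted negative-entropy regularizer $\psi(w) = \sum_i w_i \ln(w_i)/\eta_i$, so each expert has its own learning rate, together with a Steinhardt-Liang style correction term $a_{t,i} = \eta_i \ell_{t,i}^2$. As the quote notes, the update has no closed "proportional" form; the normalizing multiplier is found by a one-dimensional bisection, i.e. the line search.

```python
# A simplified sketch (my own rendering, not the exact cited update): mirror
# descent on the simplex with the weighted negative-entropy regularizer
# psi(w) = sum_i w_i * ln(w_i) / eta_i, i.e. a separate learning rate eta_i per
# expert, plus a Steinhardt-Liang style correction term.  Because the rates
# differ across coordinates, there is no closed "proportional" normalization;
# the multiplier lam is found by bisection (the line search mentioned above).
import numpy as np

def weighted_md_step(w, loss, correction, eta, iters=60):
    """One update: w_i <- w_i * exp(-eta_i * (loss_i + correction_i + lam))."""
    g = loss + correction
    lo, hi = -g.max(), -g.min()          # total mass is >= 1 at lo, <= 1 at hi
    for _ in range(iters):               # mass is decreasing in lam, so bisect
        lam = 0.5 * (lo + hi)
        mass = np.sum(w * np.exp(-eta * (g + lam)))
        if mass > 1.0:
            lo = lam
        else:
            hi = lam
    nxt = w * np.exp(-eta * (g + 0.5 * (lo + hi)))
    return nxt / nxt.sum()               # tiny renormalization for numerical safety

# Toy usage: four experts on very different loss scales, with the (assumed)
# correction a_i = eta_i * loss_i**2 and hand-picked per-expert rates.
rng = np.random.default_rng(2)
scales = np.array([0.1, 1.0, 5.0, 20.0])
eta = 0.5 / scales                        # smaller rate for larger-scale experts
w = np.full(4, 0.25)
for t in range(2000):
    loss = scales * rng.uniform(size=4)
    w = weighted_md_step(w, loss, eta * loss**2, eta)
print(w)
```

In this toy run, experts with larger loss scales receive proportionally smaller learning rates, which is the multi-scale idea in miniature.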
“…where $H_t$ is a multiset of learning rates we define below. Similar regularizers have been used by Bubeck et al. (2017) and Chen et al. (2021) to obtain multi-scale algorithms. In particular, the algorithm of Chen et al. (2021) is related to ours, as they also use corrections.…”
Section: Full Information Setting (mentioning)
confidence: 93%
“…input sequence. In particular, this is true for a monopolistic seller learning an optimal price or an optimal auction (Bubeck et al., 2017; Blum and Hartline, 2005). It is tempting to conjecture that the same holds for our setting as well, but we run into difficulties even modeling the problem.…”
Section: Future Work (mentioning)
confidence: 99%