2016
DOI: 10.1016/j.tcs.2016.07.033

Bandit online optimization over the permutahedron

Abstract: The permutahedron is the convex polytope with vertex set consisting of the vectors (π(1), …, π(n)) for all permutations (bijections) π over {1, …, n}. We study a bandit game in which, at each step t, an adversary chooses a hidden weight vector s_t, a player chooses a vertex π_t of the permutahedron and suffers an observed instantaneous loss of ∑_{i=1}^{n} π_t(i) s_t(i). We study the problem in two regimes. In the first regime, s_t is a point in the polytope dual to the permutahedron. Algorithm CombBand of Cesa-Bianchi et al…
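
Below is a minimal sketch, in Python, of the per-round loss described in the abstract: the player picks a vertex π_t of the permutahedron, the adversary holds a hidden weight vector s_t, and only the scalar loss ∑_i π_t(i) s_t(i) is observed. The sampling of s_t and π_t here is arbitrary; this is a toy illustration, not the paper's CombBand-style algorithm.

```python
# Toy illustration of one round of the permutahedron bandit game (not the
# paper's algorithm). Only the scalar loss is observed by the player.
import numpy as np

rng = np.random.default_rng(0)
n = 5

def permutahedron_vertex(pi):
    """Map a 0-based permutation pi of {0,...,n-1} to the vertex
    (pi(1),...,pi(n)) with values in {1,...,n}, as in the abstract."""
    return np.asarray(pi) + 1

s_t = rng.random(n)        # hidden weight vector chosen by the adversary
pi_t = rng.permutation(n)  # permutation (vertex) chosen by the player
loss_t = float(permutahedron_vertex(pi_t) @ s_t)  # observed instantaneous loss
print(loss_t)
```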

Cited by 9 publications (14 citation statements) | References 13 publications

Citation statements (ordered by relevance):
“…This is not surprising, as these choices do not produce valid permutation matrices. • Using a squared loss ½‖ϕ(y) − θ‖² (C = ℝ^{k×k}, no projection) works relatively well when combined with permutation decoding. Using supersets of the Birkhoff polytope as projection set C, such as [0, 1]^{k×k}, improves accuracy substantially.…”
Section: Results
Citation type: mentioning (confidence: 99%)
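
For context on "permutation decoding" in the quoted passage: finding the permutation matrix closest to a score matrix θ in squared norm reduces to a linear assignment problem, since ‖P‖² is the same for every permutation matrix P. A minimal sketch, assuming SciPy is available (the names below are illustrative, not the cited papers' code):

```python
# Squared-loss permutation decoding: argmin_P ||P - theta||^2 over permutation
# matrices equals argmax_P <P, theta>, i.e. a linear assignment problem.
import numpy as np
from scipy.optimize import linear_sum_assignment

def decode_permutation(theta):
    """Return the permutation matrix P maximizing <P, theta>."""
    rows, cols = linear_sum_assignment(-theta)  # negate the cost to maximize
    P = np.zeros_like(theta)
    P[rows, cols] = 1.0
    return P

theta = np.random.default_rng(1).normal(size=(4, 4))  # unconstrained scores
print(decode_permutation(theta))
```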
“…The structured perceptron, hinge and CRF losses are generally not consistent when using MAP as decoder d [38]. Inspired by kernel dependency estimation [55, 17, 25], several works [15, 26, 31] showed good empirical results and proved consistency by combining a squared loss S_sq(θ, y) := ½‖ϕ(y) − θ‖₂² with calibrated decoding (no oracle is needed during training). A drawback of this loss, however, is that it does not make use of the output space Y during training, ignoring precious structural information.…”
Section: Background and Related Work
Citation type: mentioning (confidence: 99%)
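
Read concretely, and assuming ϕ(y) encodes a ground-truth permutation y as a permutation matrix (an assumption made only for this illustration), the quoted squared loss can be computed as:

```python
# Illustration of S_sq(theta, y) = 1/2 * ||phi(y) - theta||_2^2, reading phi(y)
# as the permutation-matrix encoding of a ground-truth permutation y (assumed).
import numpy as np

def phi(y):
    """Encode a permutation y (y[i] = column assigned to row i) as a matrix."""
    k = len(y)
    P = np.zeros((k, k))
    P[np.arange(k), y] = 1.0
    return P

def squared_loss(theta, y):
    return 0.5 * np.sum((phi(y) - theta) ** 2)

theta = np.full((3, 3), 1.0 / 3.0)  # e.g. a uniform doubly stochastic prediction
print(squared_loss(theta, np.array([2, 0, 1])))
```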
“…It has been used for specific convex polytopes, most importantly in the optimal transport literature (Cuturi, 2013; Peyré and Cuturi, 2017) but also for learning to predict permutation matrices (Helmbold and Warmuth, 2009) or permutations (Yasutake et al., 2011; Ailon et al., 2016). The mean regularization counterpart of sparsemax is known as SparseMAP (Niculae et al., 2018):…”
Section: Mean Regularization and SparseMAP
Citation type: mentioning (confidence: 99%)
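
The entropic regularization mentioned here (Cuturi, 2013) is commonly computed with Sinkhorn iterations, which map a score matrix to a doubly stochastic matrix, i.e. a point in the Birkhoff polytope. A minimal sketch, not the cited implementations:

```python
# Minimal Sinkhorn sketch: entropy-regularized mapping of a score matrix onto
# the Birkhoff polytope (doubly stochastic matrices). Illustrative only.
import numpy as np

def sinkhorn(theta, reg=0.1, n_iter=200):
    K = np.exp(theta / reg)              # positive kernel built from the scores
    u = np.ones(K.shape[0])
    v = np.ones(K.shape[1])
    for _ in range(n_iter):
        u = 1.0 / (K @ v)                # rescale rows toward sum 1
        v = 1.0 / (K.T @ u)              # rescale columns toward sum 1
    return u[:, None] * K * v[None, :]

P = sinkhorn(np.random.default_rng(2).normal(size=(4, 4)))
print(P.sum(axis=0), P.sum(axis=1))      # both close to vectors of ones
```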
“…Permutahedra have been used to derive online learning-to-rank algorithms (Yasutake et al., 2011; Ailon et al., 2016) but it is not obvious how to extract a loss from these works. Ordered weighted averaging (OWA) operators have been used to define related top-k multiclass losses (Usunier et al., 2009; Lapin et al., 2015) but without identifying the connection. We set θ = y and inspect how the loss changes when varying each θ_i.…”
Section: Examples
Citation type: mentioning (confidence: 99%)
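
For reference, an ordered weighted averaging (OWA) operator of the kind used in the cited top-k losses applies a fixed weight vector to the sorted scores; the weights below are just an example:

```python
# Sketch of an OWA operator: sort scores in decreasing order, then take a fixed
# weighted combination. Weights (1/k,...,1/k,0,...,0) average the top-k entries.
import numpy as np

def owa(scores, weights):
    return float(np.sort(scores)[::-1] @ weights)

scores = np.array([0.2, 0.9, 0.5, 0.1])
top2_weights = np.array([0.5, 0.5, 0.0, 0.0])  # average of the two largest scores
print(owa(scores, top2_weights))               # 0.7
```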
“…Some bandit algorithms propose drawing the arms in an order determined by each arm's estimated expected reward for a given user. Ailon et al. (2014) propose the BanditRank algorithm to address this ranking problem. To do so, at each time step the algorithm submits a permutation of the set of available arms, with the main objective of sorting the arms by relevance according to the rewards obtained.…”
Section: Contextual Strategies (Stratégies Contextuelles)
Citation type: unclassified
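
As a rough caricature of the ranking idea described in this quote (not the BanditRank algorithm itself, and assuming full reward feedback each round rather than bandit feedback), one can submit at each step the permutation that sorts the arms by empirical mean reward:

```python
# Greedy toy version of ranking arms by estimated relevance. Real algorithms
# such as BanditRank add exploration and handle bandit feedback; this sketch
# assumes every arm's reward is observed at every round (a simplification).
import numpy as np

rng = np.random.default_rng(3)
n_arms, n_rounds = 5, 100
true_relevance = rng.random(n_arms)          # hypothetical reward probabilities
counts = np.zeros(n_arms)
sums = np.zeros(n_arms)

for t in range(n_rounds):
    means = np.divide(sums, counts, out=np.ones(n_arms), where=counts > 0)
    ranking = np.argsort(-means)             # most promising arm ranked first
    rewards = rng.binomial(1, true_relevance)
    sums += rewards
    counts += 1

print(ranking)                               # estimated relevance ordering
```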