2014
DOI: 10.1017/s0269964814000217

Multi-Armed Bandits Under General Depreciation and Commitment

Abstract: Generally, the multi-armed bandit problem has been studied under the setting that at each time step over an infinite horizon a controller chooses to activate a single process or bandit out of a finite collection of independent processes (statistical experiments, populations, etc.) for a single period, receiving a reward that is a function of the activated process, and in doing so advancing the chosen process. Classically, rewards are discounted by a constant factor β ∈ (0, 1) per round. In this paper, we present a solution to…
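
The classical setting described in the abstract can be illustrated with a minimal sketch. It is not the paper's method and covers only the constant-discount case, not general depreciation or commitment. The names `discounted_return`, `arms`, and `policy` are hypothetical, each arm is assumed to be a deterministic reward sequence that stays at its last value once exhausted, and the infinite horizon is truncated to a finite `horizon`.

```python
from typing import Callable, List, Sequence


def discounted_return(
    arms: Sequence[Sequence[float]],
    policy: Callable[[int, List[int]], int],
    beta: float,
    horizon: int,
) -> float:
    """Total discounted reward of `policy` over a truncated horizon.

    Assumption: arms[k][s] is the reward of arm k when it has already
    been activated s times (a deterministic stand-in for a process state).
    """
    positions = [0] * len(arms)  # how far each arm has been advanced
    total = 0.0
    for t in range(horizon):
        k = policy(t, positions)  # activate exactly one arm per round
        s = min(positions[k], len(arms[k]) - 1)  # hold the last reward
        total += (beta ** t) * arms[k][s]  # constant discount factor beta
        positions[k] += 1  # only the activated arm advances
    return total


arms = [[1.0, 0.0], [0.5, 0.5]]
always_first = lambda t, pos: 0
alternate = lambda t, pos: t % 2
value_first = discounted_return(arms, always_first, beta=0.5, horizon=2)
value_alternate = discounted_return(arms, alternate, beta=0.5, horizon=2)
```

In this toy instance, alternating beats staying on the first arm because the second arm is still in its initial state when it is first activated: arms that are not chosen do not advance.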

Cited by 16 publications (15 citation statements)
References 53 publications
“…For other work in this area we refer to Katehakis and Derman [30], Katehakis and Veinott Jr [32], Burnetas and Katehakis [8], Burnetas and Katehakis [9], Lagoudakis and Parr [35], Bartlett and Tewari [5], Tekin and Liu [44], Jouini et al [29], Dayanik, Powell, and Yamazaki [20], Filippi, Cappé, and Garivier [24], Osband and Van Roy [41]. As well as Burnetas and Katehakis [13], Audibert et al [1], Auer and Ortner [3], Gittins, Glazebrook, and Weber [25], Bubeck and Slivkins [6], Cappé et al [15], Kaufmann [33], Li, Munos, and Szepesvári [38], Cowan and Katehakis [17], Cowan and Katehakis [19], and references therein. For dynamic programming extensions we refer to Burnetas and Katehakis [11], Butenko, Pardalos, and Murphey [14], Tewari and Bartlett [45], Audibert et al [1], Littman [39], Feinberg, Kasyanov, and Zgurovsky [22] and references therein.…”
Section: (13), mentioning
confidence: 99%
“…This effect does not influence the long-term almost sure behavior of these policies. For other significant related recent work, we refer to Garivier et al [9], Lattimore [14], Ortner [16], Orabona and Pál [15], Cowan and Katehakis [5][6][7].…”
Section: Related Literature, mentioning
confidence: 99%
“…Hence, we can invoke the restart problem introduced in Katehakis and Veinott, Jr. [17] and Cowan and Katehakis [8] to compute the robust indices. 1 Indeed, one can show that for a fixed initial…”
Section: Proposition 2 the Robust Gittins Index Is Given By, mentioning
confidence: 99%
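
The restart formulation mentioned in the last statement can be sketched for the classical (non-robust) Gittins index of a finite Markov reward chain. This is an assumption-laden illustration, not the citing paper's robust computation: the function name `gittins_index_restart` and its parameters are hypothetical, and the fixed point is found by plain value iteration. In the restart problem for a state i, each state j offers two actions: continue from j, or restart as if in i. The index of i is then (1 − β) times the optimal value at i.

```python
from typing import List, Sequence


def gittins_index_restart(
    P: Sequence[Sequence[float]],
    r: Sequence[float],
    beta: float,
    state: int,
    tol: float = 1e-10,
    max_iter: int = 100_000,
) -> float:
    """Classical Gittins index of `state` via the restart-in-state problem.

    P is a row-stochastic transition matrix, r the per-state rewards,
    and beta in (0, 1) the constant discount factor.
    """
    n = len(r)
    V: List[float] = [0.0] * n
    for _ in range(max_iter):
        # Value of continuing from each state j under the current estimate.
        cont = [
            r[j] + beta * sum(P[j][k] * V[k] for k in range(n))
            for j in range(n)
        ]
        # Restarting from any state behaves like continuing from `state`.
        restart = cont[state]
        new_V = [max(c, restart) for c in cont]
        gap = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if gap < tol:
            break
    return (1.0 - beta) * V[state]


# State 0 pays 1 once, then moves to an absorbing state 1 that pays 0.
P = [[0.0, 1.0], [0.0, 1.0]]
r = [1.0, 0.0]
index_0 = gittins_index_restart(P, r, beta=0.5, state=0)
index_1 = gittins_index_restart(P, r, beta=0.5, state=1)
```

For this two-state chain the index of state 0 is approximately 1 (collect the reward, then restart) and the index of the absorbing state is 0, which matches the interpretation of the index as the best achievable discounted reward per unit of discounted time.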