2020
DOI: 10.1609/aaai.v34i04.5821

Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack

Abstract: The multi-armed bandit formalism has been extensively studied under various attack models, in which an adversary can modify the reward revealed to the player. Previous studies focused on scenarios where the attack value either is bounded at each round or has a vanishing probability of occurrence. These models do not capture powerful adversaries that can catastrophically perturb the revealed reward. This paper investigates the attack model where an adversary attacks with a certain probability at each round, and…
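The attack model in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the function names, the Gaussian reward noise, the attack probability of 0.1, and the attack value of 1e9 are all hypothetical choices made for illustration. The comparison between the sample mean and the sample median is likewise an illustrative choice, included to show why unbounded perturbations are harmful to a naive estimator.

```python
import random
import statistics


def revealed_reward(true_reward, attack_prob, attack_value, rng):
    """Return the reward the player observes in one round.

    Assumption: with probability attack_prob the adversary replaces the
    true reward with an arbitrary value (no bound on its magnitude).
    """
    if rng.random() < attack_prob:
        return attack_value
    return true_reward


def simulate_arm(true_mean, attack_prob, attack_value, rounds, seed=0):
    """Pull a single arm repeatedly and collect the revealed rewards."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rewards = []
    for _ in range(rounds):
        clean = rng.gauss(true_mean, 0.1)  # hypothetical noise model
        rewards.append(revealed_reward(clean, attack_prob, attack_value, rng))
    return rewards


rewards = simulate_arm(true_mean=0.5, attack_prob=0.1,
                       attack_value=1e9, rounds=2000)
mean_estimate = statistics.mean(rewards)
median_estimate = statistics.median(rewards)

print(f"mean estimate:   {mean_estimate:.3g}")
print(f"median estimate: {median_estimate:.3g}")
```

In this setup a single attacked round is enough to pull the sample mean far from the true mean of 0.5, while the sample median stays close to it as long as attacked rounds remain a minority.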

Cited by 22 publications (18 citation statements)
References 9 publications
“…Suppose a mafia designer hopes to force $a^\dagger = (\text{mum}, \text{mum})$ by sabotaging the losses. Note that $\forall i,\ \ell^o_i(a^\dagger) = 2$, which lies in the interior of the loss range $\mathcal{L} = [1, 5]$. Therefore, this is again an interior design scenario, and the designer can apply Algorithm 1.…”
Section: Prisoner's Dilemma (PD)
Citation type: mentioning
confidence: 99%
“…To provide a neat version of thesis with closely correlated topics, this thesis does not include all of the author's works. We briefly talk about some representatives of the author's other research works [56,64,59,123,61,60,125,101,116,47,130,129,133] as follows.…”
Section: Other PhD Research
Citation type: mentioning
confidence: 99%
“…[27] presents an algorithm named BARBAR that is robust to reward poisoning attacks and the regret of the proposed algorithm is nearly optimal. [28] considers a reward poisoning attack model where an adversary attacks with a certain probability at each round. As its attack value at each round can be arbitrary and unbounded, the attack model could be powerful.…”
Section: Related Work
Citation type: mentioning
confidence: 99%