2012
DOI: 10.1016/j.jcss.2011.12.027

Learning with stochastic inputs and adversarial outputs

Abstract: Most of the research in online learning is focused either on the problem of adversarial classification (i.e., both inputs and labels are arbitrarily chosen by an adversary) or on the traditional supervised learning problem in which samples are independent and identically distributed according to a stationary probability distribution. Nonetheless, in a number of domains the relationship between inputs and outputs may be adversarial, whereas input instances are i.i.d. from a stationary distribution (e.g., user p…
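
The abstract describes a regime with inputs drawn i.i.d. from a fixed distribution while labels are chosen adversarially. As a point of reference only, here is a minimal sketch of a standard exponentially weighted (Hedge) forecaster in that regime; the threshold hypothesis class, the learning rate, and the adversary that simply flips the learner's prediction are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite hypothesis class: thresholds h_c(x) = 1{x >= c}.
thresholds = np.linspace(0.1, 0.9, 9)
K, T = len(thresholds), 1000

eta = np.sqrt(8 * np.log(K) / T)  # standard Hedge learning rate
weights = np.ones(K)
cum_loss = np.zeros(K)            # cumulative loss of each hypothesis
learner_loss = 0.0

for t in range(T):
    x = rng.random()                          # input: i.i.d. from a stationary distribution
    preds = (x >= thresholds).astype(float)   # each hypothesis labels the input
    p = weights / weights.sum()
    forecast = float(p @ preds)               # weighted-average prediction in [0, 1]
    y = 1.0 if forecast < 0.5 else 0.0        # adversarial label: the costlier outcome
    learner_loss += abs(forecast - y)
    losses = np.abs(preds - y)                # per-hypothesis absolute (0-1) loss
    cum_loss += losses
    weights *= np.exp(-eta * losses)          # exponential-weights update

print(f"learner loss: {learner_loss:.1f}  best hypothesis: {cum_loss.min():.1f}")
```

Even against labels chosen to hurt the current forecast, the learner's cumulative loss tracks that of the best fixed hypothesis up to the usual sublinear regret term.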

Cited by 5 publications (12 citation statements) · References 29 publications

“…Thus, it remains to bound the expectation of the other two terms. This is done in the following two propositions, whose proofs are deferred to the next subsections. (Footnote 12: Lemma 5.2 of Even-Dar et al. [7] gives a bound on $\|\nu_t - \mu^{\pi_t}_{st}\|_1$ with a slightly different technique. However, there are multiple mistakes in the proof.)…”
Section: (21)
Citation type: mentioning
confidence: 99%
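
For context on the quantity in this excerpt: $\nu_t$ is the state distribution at time $t$ and $\mu^{\pi_t}_{st}$ is the stationary distribution of the policy played at time $t$. Under the uniform mixing assumption common in this line of work (every policy's kernel contracts $L_1$ distances at rate $e^{-1/\tau}$ for a mixing time $\tau$), the kind of bound at issue follows; this is a sketch of the standard argument, not the cited lemma itself.

```latex
% Uniform mixing assumption: for all distributions \nu, \nu' and all policies \pi,
%   \| \nu P^{\pi} - \nu' P^{\pi} \|_1 \le e^{-1/\tau} \, \| \nu - \nu' \|_1 .
% Taking \nu' = \mu^{\pi}_{st} (which is invariant under P^{\pi}) gives a contraction,
\[
  \bigl\| \nu P^{\pi} - \mu^{\pi}_{st} \bigr\|_1
    \;\le\; e^{-1/\tau} \, \bigl\| \nu - \mu^{\pi}_{st} \bigr\|_1 ,
\]
% so for a fixed policy the state distribution approaches stationarity geometrically:
\[
  \bigl\| \nu_t - \mu^{\pi}_{st} \bigr\|_1
    \;\le\; e^{-t/\tau} \, \bigl\| \nu_0 - \mu^{\pi}_{st} \bigr\|_1 .
\]
```

When the policy changes from round to round, as in the online setting, the same contraction is applied with extra terms accounting for the drift between consecutive policies, which is what the cited proof must control.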
“…Finally, we note in passing that the contextual bandit problem considered by Lazaric and Munos [12] can also be regarded as a simplified version of our online learning problem where the states are generated in an i.i.d. fashion (though we do not consider the problem of competing with the best policy in a restricted subset of stationary policies).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In this scenario, we do not have access to another sequence $X^n$. However, at time step $i + 1$, we have access to the past sequence $X^i$, which could be used, as done in [10], in lieu of $X^i$. We now precisely define and analyze this probability assignment.…”
Section: Epoch-based Mixture Probability
Citation type: mentioning
confidence: 99%
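
The excerpt concerns a sequential probability assignment built from the past sequence $X^i$ alone. As an illustration of the mechanics only, here is the classical Krichevsky–Trofimov (add-1/2) assignment for binary sequences; it is a stand-in, not the epoch-based mixture defined in the citing paper.

```python
import math

def kt_probability(past_bits, next_bit):
    """Krichevsky-Trofimov (add-1/2) estimate of the next symbol's probability,
    computed from the past sequence alone (the role X^i plays in the excerpt)."""
    ones = sum(past_bits)
    i = len(past_bits)
    p_one = (ones + 0.5) / (i + 1.0)
    return p_one if next_bit == 1 else 1.0 - p_one

# Cumulative log-loss of the assignment on an example sequence.
seq = [0, 1, 1, 0, 1, 1, 1, 0]
log_loss = -sum(math.log(kt_probability(seq[:i], seq[i])) for i in range(len(seq)))
print(f"cumulative log-loss: {log_loss:.3f} nats")
```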
“…In particular, $\mathrm{VCdim}(G) < \infty$ implies the PAC-learnability of the hypothesis class $G$. Viewing the current setting as a log-loss variant of the standard classification problem studied in statistical learning (which uses the indicator loss) motivates the choice of the constraint $\mathrm{VCdim}(G) < \infty$. A variant of the current problem with the indicator loss instead of the log-loss was studied in [10]. We have considered a specific class of conditional distributions to compete against (recall that under hypothesis $f$ we have $p_f(Y = 0 \mid X = x) = \mathrm{Bern}(\theta_{g(x)})$).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
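
To make the quoted hypothesis form concrete: each hypothesis $f$ pairs a map $g$ from inputs to cells with Bernoulli parameters $\theta$, and the label's conditional distribution is governed by $\theta_{g(x)}$. Below is a minimal sketch contrasting the log-loss used in this excerpt with the indicator loss of standard classification; the particular $g$ and $\theta$ are invented for illustration.

```python
import math

# Hypothetical hypothesis f = (g, theta): g partitions the input space,
# and theta[g(x)] = P(Y = 0 | X = x), as in the quoted excerpt.
theta = [0.2, 0.9]               # illustrative Bernoulli parameters

def g(x):
    return 0 if x < 0.5 else 1   # illustrative two-cell partition

def log_loss(x, y):
    p0 = theta[g(x)]             # P(Y = 0 | X = x) under hypothesis f
    return -math.log(p0 if y == 0 else 1.0 - p0)

def indicator_loss(x, y):
    y_hat = 0 if theta[g(x)] >= 0.5 else 1  # plug-in classifier induced by f
    return int(y_hat != y)

print(log_loss(0.7, 1), indicator_loss(0.7, 1))  # ~2.303 nats, indicator loss 1
```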
“…Since we aim to train these models in interaction with real users, we focus on the ease of eliciting the feedback and on the speed of convergence. In the spectrum of stochastic versus adversarial bandits, our approach is semi-adversarial in making stochastic assumptions on inputs, but not on rewards [12].…”
Section: Related Work
Citation type: mentioning
confidence: 99%