2010
DOI: 10.1007/s10994-010-5225-4

Knows what it knows: a framework for self-aware learning

Abstract: We introduce a learning framework that combines elements of the well-known PAC and mistake-bound models. The KWIK (knows what it knows) framework was designed particularly for its utility in learning settings where active exploration can impact the training examples the learner is exposed to, as is true in reinforcement-learning and active-learning problems. We catalog several KWIK-learnable classes and open problems. Motivation: At the core of recent reinforcement-learning algorithms that have polynomial sample…

Cited by 111 publications (144 citation statements)
References: 32 publications
“…Furthermore, if $\sup_{(x,a,t)} |R_t(x, a)| \le R$, then for any $T$, $\mathrm{Regret}(T) \le 2RKH$ 2. Notice that this result can also be derived based on the KWIK online regression with deterministic linear functions (see [17]). …”
Section: Learning With a Coherent Hypothesis Class
type: mentioning
confidence: 65%
“…A first case generalizes the first example for the case where each arm provides an object among k classes. For this we can model each arm as a multinomial distribution and we know that we can learn the task by making a number of queries bounded by [37]: $B(\epsilon, \delta) = n \frac{8}{\epsilon^2} \ln \frac{2n}{\delta}$. A more complex example is when each model is itself a reinforcement learning problem.…”
Section: A Submodular Cost
type: mentioning
confidence: 99%
“…A more complex example is when each model is itself a reinforcement learning problem. The learning curve has been shown to be polynomial ([18], [37]). This means that the expected accuracy of the algorithm is always increasing with the number of samples.…”
Section: A Submodular Cost
type: mentioning
confidence: 99%
“…This knowledge provides a useful mechanism for efficient exploration. Motivated by this observation, the "Knows What It Knows" (or KWIK) framework is proposed to capture the essence of quantifying estimation uncertainty (Li et al., 2011).…”
Section: Knows What It Knows
type: mentioning
confidence: 99%
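
Illustration: the statements above describe KWIK as a framework in which a learner quantifies its own uncertainty. A minimal sketch of that interaction pattern follows, assuming a deterministic target over a finite input space. The learner either returns an answer it has already observed or explicitly reports that it does not know, and it receives a label only after admitting uncertainty. The names `MemorizationLearner`, `UNKNOWN`, and `run_protocol` are hypothetical and chosen for this example; this is not code from the paper or from any citing work.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# Sentinel for the learner's "I don't know" answer (hypothetical name).
UNKNOWN = object()


class MemorizationLearner:
    """Sketch of a learner that answers only on inputs it has already seen.

    Assumes the target is deterministic, so one observed label per input
    is enough to answer correctly from then on.
    """

    def __init__(self) -> None:
        self._memory: Dict[Hashable, object] = {}
        self.unknown_count = 0  # number of "I don't know" answers so far

    def predict(self, x: Hashable) -> object:
        # Answer from memory when possible; otherwise admit uncertainty.
        if x in self._memory:
            return self._memory[x]
        self.unknown_count += 1
        return UNKNOWN

    def observe(self, x: Hashable, y: object) -> None:
        # Store the true label revealed after an "I don't know" answer.
        self._memory[x] = y


def run_protocol(
    learner: MemorizationLearner,
    stream: Iterable[Tuple[Hashable, object]],
) -> List[object]:
    """Feed (input, true label) pairs to the learner.

    The label is revealed only when the learner answers UNKNOWN.
    """
    outputs = []
    for x, y in stream:
        answer = learner.predict(x)
        outputs.append(answer)
        if answer is UNKNOWN:
            learner.observe(x, y)
    return outputs


if __name__ == "__main__":
    demo_stream = [("a", 1), ("b", 0), ("a", 1), ("b", 0), ("c", 1)]
    demo_learner = MemorizationLearner()
    demo_outputs = run_protocol(demo_learner, demo_stream)
    print(["?" if o is UNKNOWN else o for o in demo_outputs])
    print("unknown answers:", demo_learner.unknown_count)
```

In this sketch the number of "I don't know" answers can never exceed the number of distinct inputs, which is the kind of quantity a KWIK-style bound limits. Noisy targets would need repeated observations per input and are outside the scope of this example.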