2012
DOI: 10.1007/s10994-012-5321-8
Multiclass classification with bandit feedback using adaptive regularization

Abstract: We present a new multiclass algorithm in the bandit framework, where after making a prediction, the learning algorithm receives only partial feedback, i.e., a single bit of right-or-wrong, rather than the true label. Our algorithm is based on the 2nd-order Perceptron and uses upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partially adversarial setting, where instances are chosen adversarially, while the labels are chosen according to a linear probabilistic model…
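The exploration–exploitation mechanism the abstract describes can be sketched as a toy bandit multiclass loop. This is an illustrative sketch only, not the paper's actual algorithm: the per-class matrices `A`, the exploration strength `alpha`, and the step size `eta` are all assumptions introduced here.

```python
import numpy as np

d, K = 5, 3            # feature dimension and number of classes (illustrative)
alpha, eta = 1.0, 0.5  # exploration strength and step size (hypothetical knobs)

W = np.zeros((K, d))               # per-class weight vectors
A = [np.eye(d) for _ in range(K)]  # per-class second-order matrices

def predict(x):
    """Pick the class maximizing linear score plus a UCB-style width."""
    scores = [W[k] @ x + alpha * np.sqrt(x @ np.linalg.inv(A[k]) @ x)
              for k in range(K)]
    return int(np.argmax(scores))

def update(x, y_hat, correct):
    """Learn from a single right-or-wrong bit about prediction y_hat."""
    A[y_hat] += np.outer(x, x)      # shrink the confidence width for y_hat
    if correct:
        W[y_hat] += eta * x         # reinforce the queried class
    else:
        W[y_hat] -= eta * x         # push the score for y_hat down
```

The UCB width `sqrt(x @ A_inv @ x)` is large for classes whose scores are still uncertain on directions like `x`, so under-explored classes get predicted (and hence receive feedback) more often.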

Cited by 119 publications (95 citation statements)
References 12 publications (12 reference statements)
“…In this context, learning strategies to automatically adjust the trade-off parameter for the Bandit model were presented [13]. In [14], a multi-class classification scheme was introduced with bandit feedback using adaptive regularization. The approach is based on the 2nd-order perceptron and makes use of upper confidence bounds (UCBs) [15] to balance the trade-off between exploration and exploitation, improving the bounds of the Banditron [10].…”
Section: Related Workmentioning
confidence: 99%
“…On the other hand, RLS [16] methods based on linear regression have shown better performance than the Banditron class of algorithms [14] and performance similar to that of SVMs and other kernel-based methods [16], with the additional ability to provide confidence estimates for the predicted samples. Therefore, in this paper we propose interactive incremental learning algorithms derived from RLS methods.…”
Section: Related Workmentioning
confidence: 99%
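For context, the confidence estimates that RLS-style methods can attach to predictions come from the regularized design matrix. A minimal sketch, assuming standard ridge regression; the helper names and the regularization parameter `lam` are this sketch's own, not the cited papers' notation:

```python
import numpy as np

def rls_fit(X, y, lam=1.0):
    """Regularized least squares: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    w = np.linalg.solve(A, X.T @ y)
    return w, np.linalg.inv(A)

def rls_predict(w, A_inv, x):
    """Return the prediction together with a confidence width for x."""
    return w @ x, float(np.sqrt(x @ A_inv @ x))
```

A small width means `x` lies in a direction the training data has already covered well, which is exactly the signal an interactive learner can use to decide when to query for feedback.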