2012
DOI: 10.1007/s10994-012-5321-8
Multiclass classification with bandit feedback using adaptive regularization

Abstract: We present a new multiclass algorithm in the bandit framework, where after making a prediction, the learning algorithm receives only partial feedback, i.e., a single bit of right-or-wrong, rather than the true label. Our algorithm is based on the 2nd-order Perceptron and uses upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partially adversarial setting, where instances are chosen adversarially, while the labels are chosen according to a linear probabilistic model…
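The exploration–exploitation mechanism the abstract describes can be sketched as a toy bandit multiclass loop. This is an illustrative sketch only, not the paper's actual algorithm: the per-class matrices `A`, the exploration strength `alpha`, and the step size `eta` are all assumptions introduced here.

```python
import numpy as np

d, K = 5, 3            # feature dimension and number of classes (illustrative)
alpha, eta = 1.0, 0.5  # exploration strength and step size (hypothetical knobs)

W = np.zeros((K, d))               # per-class weight vectors
A = [np.eye(d) for _ in range(K)]  # per-class second-order matrices

def predict(x):
    """Pick the class maximizing linear score plus a UCB-style width."""
    scores = [W[k] @ x + alpha * np.sqrt(x @ np.linalg.inv(A[k]) @ x)
              for k in range(K)]
    return int(np.argmax(scores))

def update(x, y_hat, correct):
    """Learn from a single right-or-wrong bit about prediction y_hat."""
    A[y_hat] += np.outer(x, x)      # shrink the confidence width for y_hat
    if correct:
        W[y_hat] += eta * x         # reinforce the queried class
    else:
        W[y_hat] -= eta * x         # push the score for y_hat down
```

The UCB width `sqrt(x @ A_inv @ x)` is large for classes whose scores are still uncertain on directions like `x`, so under-explored classes get predicted (and hence receive feedback) more often.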

Cited by 119 publications (95 citation statements)
References 12 publications (12 reference statements)
“…In this context, learning strategies to automatically adjust the trade-off parameter for the Bandit model were presented [13]. In [14], a multi-class classification scheme was introduced with bandit feedback using adaptive regularization. The approach is based on the 2nd-order perceptron and makes use of upper confidence bounds (UCBs) [15] to balance the trade-off between exploration and exploitation, improving the bounds of the Banditron [10].…”
Section: Related Workmentioning
confidence: 99%
“…On the other hand, RLS [16] methods based on linear regression have shown better performance than the Banditron class of algorithms [14] and performance similar to that of SVMs and other kernel-based methods [16], with the additional ability to provide confidence estimates for the predicted samples. Therefore, in this paper we propose interactive incremental learning algorithms derived from RLS methods.…”
Section: Related Workmentioning
confidence: 99%
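For context, the confidence estimates that RLS-style methods can attach to predictions come from the regularized design matrix. A minimal sketch, assuming standard ridge regression; the helper names and the regularization parameter `lam` are this sketch's own, not the cited papers' notation:

```python
import numpy as np

def rls_fit(X, y, lam=1.0):
    """Regularized least squares: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    w = np.linalg.solve(A, X.T @ y)
    return w, np.linalg.inv(A)

def rls_predict(w, A_inv, x):
    """Return the prediction together with a confidence width for x."""
    return w @ x, float(np.sqrt(x @ A_inv @ x))
```

A small width means `x` lies in a direction the training data has already covered well, which is exactly the signal an interactive learner can use to decide when to query for feedback.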