2022
DOI: 10.48550/arxiv.2202.02794
Preprint

Active Learning on a Budget: Opposite Strategies Suit High and Low Budgets

Abstract: Investigating active learning, we focus on the relation between the number of labeled examples (budget size), and suitable corresponding querying strategies. Our theoretical analysis shows a behavior reminiscent of phase transition: typical points should best be queried in the low budget regime, while atypical (or uncertain) points are best queried when the budget is large. Combined evidence from our theoretical and empirical studies shows that a similar phenomenon occurs in simple classification models. Accor…

Cited by 5 publications (5 citation statements)
References 28 publications

“…We see that our method outperforms random sampling and other AL baselines by a large margin. We note that in agreement with previous works [6,18], AL strategies that are suited for high budgets do not improve the results of random sampling, while AL strategies that are suited for low budgets do.…”
Section: Results | supporting
confidence: 91%
“…When the budget contains only a few examples, they will struggle to improve the model's performance, not even reaching the accuracy of the random baseline. Recently, it was shown that uncertainty sampling is inherently unsuited for the low budget regime, which may explain the cold start phenomenon [18]. The low-budget scenario is relevant in many applications, especially those requiring an expert tagger whose time is expensive (e.g a radiologist tagger for tumor detection).…”
Section: Introduction | mentioning
confidence: 99%
“…In addition to the above methods, there are also methods that combine active learning and strong data enhancement, such as LADA [46], which not only provides an active learning method that can work under strong data enhancement, but also builds a learnable data enhancement method in turn to improve the effect of data enhancement. They also include Vab-AL [47], which realizes the most valuable sample selection in the case of category imbalance, Revival [48] and Boostmis [49], which combine active learning with semi-supervised learning, and Cluster-Margin [50] and TypiClust [51] which are improved based on the sample size Budget of each selection.…”
Section: Active Learning | mentioning
confidence: 99%
“…An important result that connects the paradigm of AL with the available sampling budget is given in [279] who theoretically derive that the best AL strategy crucially depends on the available labeling budget. More precisely, for a high budget, one should query uncertain samples while for a low budget, one should focus on "typical" points where they define the typicality as the inverse of the average squared distance to the nearest neighbors.…”
Section: Different Paradigms For Active Learning | mentioning
confidence: 99%
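
The typicality definition quoted in the last statement (the inverse of the average squared distance to the nearest neighbors) can be illustrated with a short sketch. This is not the cited paper's code: the function name `typicality`, the parameter `k`, and the use of plain Euclidean distance on raw feature vectors are assumptions made here for illustration only.

```python
import numpy as np

def typicality(points, k):
    """Inverse of the mean squared Euclidean distance from each point
    to its k nearest neighbors (hypothetical helper, for illustration).

    Assumes k < len(points) and no duplicate points (a duplicate would
    give a zero distance and hence an infinite score).
    """
    points = np.asarray(points, dtype=float)
    # Pairwise squared distances via broadcasting: shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # A point is not its own neighbor.
    np.fill_diagonal(sq_dists, np.inf)
    # Keep the k smallest squared distances in each row.
    nearest = np.sort(sq_dists, axis=1)[:, :k]
    return 1.0 / nearest.mean(axis=1)

# Three points close together and one far away.
pts = [[0, 0], [0, 1], [1, 0], [10, 10]]
scores = typicality(pts, k=2)
most_typical = int(np.argmax(scores))   # candidate query under a low budget
least_typical = int(np.argmin(scores))  # the isolated point
```

Under these assumptions, a low-budget strategy would favor points with high scores (dense regions), while the isolated point receives the lowest score.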