2021
DOI: 10.1109/access.2021.3135514

A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification

Abstract: Pool-based active learning (AL) aims to optimize the annotation process (i.e., labeling) as the acquisition of annotations is often time-consuming and therefore expensive. For this purpose, an AL strategy queries annotations intelligently from annotators to train a high-performance classification model at a low annotation cost. Traditional AL strategies operate in an idealized framework. They assume a single, omniscient annotator who never gets tired and charges uniformly regardless of query difficulty. Howeve…

Citations: cited by 11 publications (9 citation statements)
References: 121 publications
“…We believe that it is vital for many tasks to distinguish between potential unknown objects and a natural background. With that addition, it would be possible, for instance, to improve the responses to different autonomous driving scenarios or perform better exploration during active learning (Herde, Huseljic, Sick, & Calma, 2021). The training of such an object detection architecture might be realized by using out-of-distribution data (Huseljic et al, 2021).…”
Section: Discussion (mentioning)
confidence: 99%
“…In terms of the data quality [47][48][49], these can be outliers that may have nothing to do with the actual dataset or external influences that permanently degrade the data, e.g., overlays by other signals, or overexposure and motion blur in case of cameras. Another source of corner cases in ML is incorrect, noisy, or erroneous labeled data, which can lead to devastating performance drops of the ML model [6,7,50]…”
Section: Quality (mentioning)
confidence: 99%
“…The influence of ML methods is not surprising considering their impressive performance, e.g., object detection in images [2,3], speech recognition [4], large language models [5], and other applications. Nevertheless, all these tasks have something in common as they rely on data, which is often imbalanced or incomplete, and the labels can be inaccurate or inconsistent [6,7]. Interpreting and modeling the epistemic and aleatoric uncertainty [8] of an ML model with techniques such as MC-Dropout [9], Bayes by Backprop [10],…”
Section: Introduction (mentioning)
confidence: 99%
“…Research in IML explores ways to learn and manipulate models through an intuitive human-computer interface (Michael et al, 2020) and encompasses a variety of learning and interaction strategies. Perhaps the most well-known IML framework is active learning (Settles, 2012;Herde et al, 2021), which tackles learning high-performance predictors in settings in which supervision is expensive. To this end, active learning algorithms interleave acquiring labels of carefully selected unlabeled instances from an annotator and model updates.…”
Section: Explanations in Interactive Machine Learning (mentioning)
confidence: 99%