Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.145

Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates

Abstract: Annotating training data for sequence tagging of texts is usually very time-consuming. Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget. We are the first to thoroughly investigate this powerful combination for the sequence tagging task. We conduct an extensive empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pretrained mo…
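The truncated abstract centers on Monte Carlo dropout as a source of Bayesian uncertainty estimates from a deep pre-trained tagger. As a hedged illustration only (the paper's exact acquisition functions and dropout configuration are not reproduced here), the sketch below keeps dropout active at inference time and averages several stochastic forward passes to score unlabeled sentences; the model interface and the entropy-based score are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def mc_dropout_scores(model, batch, n_samples=10):
    """Sentence-level uncertainty from Monte Carlo dropout.

    Assumes `model(**batch)` returns token logits of shape
    (batch_size, seq_len, n_tags) and that the model contains dropout
    layers (as BERT-style taggers do). Calling .train() keeps dropout
    active at inference time, so repeated forward passes disagree, and
    that disagreement approximates Bayesian predictive uncertainty.
    """
    model.train()  # keep dropout switched ON during inference
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(**batch), dim=-1) for _ in range(n_samples)]
        )  # (n_samples, batch_size, seq_len, n_tags)
    mean_probs = probs.mean(dim=0)
    # Token-level predictive entropy of the averaged distribution; other
    # acquisition functions (e.g. BALD) could be swapped in here.
    token_entropy = -(mean_probs * (mean_probs + 1e-12).log()).sum(dim=-1)
    return token_entropy.mean(dim=-1)  # one uncertainty score per sentence
```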

Cited by 15 publications (26 citation statements). References 27 publications.

“…We follow the AL settings in previous work to achieve consistent evaluation (Kim, 2020; Shelmanov et al., 2021; Liu et al., 2022). Specifically, the unlabeled pool is created by discarding labels from the original training data of each dataset; 2% of which (∼242 sentences) is selected for labeling at each iteration for a total of 25 iterations (examples of the first iteration are randomly sampled to serve as the seed D_0).…”
Section: Discussion
confidence: 99%
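The protocol quoted above (an unlabeled pool built by discarding gold labels, a random seed set D_0, then roughly 2% of the pool labeled per iteration for 25 iterations) can be outlined in a few lines. `train_model` and `acquisition_score` below are hypothetical placeholders for whatever tagger and uncertainty measure the citing works use; this is a sketch of the evaluation protocol, not their code.

```python
import random


def simulate_active_learning(pool, train_model, acquisition_score,
                             n_iterations=25, fraction=0.02):
    """Pool-based AL simulation following the quoted protocol: gold labels
    are hidden and 'revealed' only for the examples chosen at each
    iteration; the first batch is a random seed set D_0."""
    pool = list(pool)
    batch_size = max(1, int(fraction * len(pool)))  # ~2% of the pool

    labeled = random.sample(pool, batch_size)       # seed set D_0
    unlabeled = [x for x in pool if x not in labeled]

    for _ in range(1, n_iterations):
        model = train_model(labeled)
        # Rank remaining examples by informativeness, label the top batch.
        ranked = sorted(unlabeled,
                        key=lambda x: acquisition_score(model, x),
                        reverse=True)
        batch, unlabeled = ranked[:batch_size], ranked[batch_size:]
        labeled.extend(batch)
    return train_model(labeled)
```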
“…As discussed in the introduction, model training and data selection at each iteration of traditional AL methods might consume significant time (especially with the current trend of large-scale language models), thus introducing a long idle time for annotators that might reduce annotation quality and quantity. To this end, Shelmanov et al. (2021) have explored approaches to accelerate the training and data selection steps of AL by leveraging smaller, approximate models during the AL iterations. To make it more efficient, the main large model is trained only once, at the end, over all the examples annotated during AL.…”
Section: Proxy Active Learning
confidence: 99%
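The acceleration idea described in this quote, a small approximate model driving the query loop while the main large model is fine-tuned only once on the final annotated set, can be sketched as follows; `train_small`, `train_large`, and `score` are hypothetical stand-ins rather than the cited implementation.

```python
def proxy_active_learning(labeled_seed, unlabeled, n_iterations, batch_size,
                          train_small, train_large, score):
    """Proxy-AL sketch: a small, cheap model is retrained at every AL
    iteration to pick new examples, while the expensive pre-trained model
    is trained a single time on everything annotated at the end."""
    labeled = list(labeled_seed)
    unlabeled = list(unlabeled)
    for _ in range(n_iterations):
        proxy = train_small(labeled)                 # fast to retrain
        ranked = sorted(unlabeled, key=lambda x: score(proxy, x),
                        reverse=True)
        batch, unlabeled = ranked[:batch_size], ranked[batch_size:]
        labeled.extend(batch)
    # The main large model sees the annotations only once, at the very end.
    return train_large(labeled)
```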
“…AL and example selection Active learning (AL) (Lewis and Catlett, 1994; Settles and Craven, 2008; Settles, 2009; Houlsby et al., 2011) is a well-studied field that investigates how machine learning algorithms might automatically select helpful additional data points to maximize their performance. Such strategies are especially helpful in imbalanced settings (Ertekin et al., 2007; Mussmann et al., 2020) and have been fruitfully applied to deep models (Gal et al., 2017; Beluch et al., 2018), including pretrained models (Yuan et al., 2020; Margatina et al., 2021; Shelmanov et al., 2021). Past work has also considered AL for few-shot learning (Woodward and Finn, 2017).…”
Section: Failure Cases
confidence: 99%