2021
DOI: 10.48550/arxiv.2101.08133
Preprint
Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates

Abstract: Annotating training data for sequence tagging tasks is usually very time-consuming. Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget. We are the first to thoroughly investigate this powerful combination in sequence tagging. We find that taggers based on deep pre-trained models can benefit from Bayesian query strategies with the help of the Monte Carlo (MC) dropout. Results of exp…
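The MC dropout technique the abstract refers to can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the paper's implementation: it presumes a PyTorch token classifier whose forward pass returns per-token logits with the signature shown, and all names are hypothetical.

import torch
import torch.nn.functional as F

def mc_dropout_token_entropy(model, input_ids, attention_mask, n_samples=10):
    """Per-token predictive entropy estimated with MC dropout."""
    model.train()  # keep dropout layers active at inference time (MC dropout)
    samples = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(input_ids, attention_mask)  # (batch, seq, tags); assumed signature
            samples.append(F.softmax(logits, dim=-1))
    mean_probs = torch.stack(samples).mean(dim=0)      # average the stochastic passes
    # Entropy of the averaged predictive distribution, computed per token.
    return -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)

Tokens (or sentences built from them) with high entropy are the ones a Bayesian query strategy would send to annotators first.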

Cited by 2 publications (3 citation statements) | References 19 publications
“…Active learning for NER. Active learning for NER has seen a variety of methodologies being developed to address its unique challenges (Settles and Craven, 2008; Marcheggiani and Artieres, 2014; Shelmanov et al., 2021). The overarching goal is to reduce the budget for labeling sequence data by selectively querying informative samples.…”
Section: Related Work (mentioning)
confidence: 99%
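The excerpt above describes the generic pool-based active learning loop these methods share: score the unlabeled pool, query the most informative samples, label them, retrain. Below is an illustrative sketch, not any cited paper's implementation; train, score, and annotate are hypothetical helpers standing in for model training, an informativeness measure, and human annotation.

def active_learning_loop(labeled, unlabeled, train, score, annotate,
                         budget=200, batch_size=20):
    """Pool-based active learning: repeatedly query informative samples."""
    model = train(labeled)
    spent = 0
    while spent < budget and unlabeled:
        # Rank the pool by an informativeness score
        # (e.g. the MC-dropout uncertainty sketched earlier).
        ranked = sorted(unlabeled, key=lambda x: score(model, x), reverse=True)
        query, unlabeled = ranked[:batch_size], ranked[batch_size:]
        labeled.extend(annotate(query))  # oracle / human annotation step
        spent += len(query)
        model = train(labeled)           # retrain on the enlarged labeled set
    return model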
“…The primary challenge in applying active learning to sequence tagging lies in addressing data imbalance at the entity level. Traditional methods of active sequence tagging, such as those referenced by Shen et al. (2017), Zhang et al. (2020), and Shelmanov et al. (2021), typically generate scores for sentences by summing or averaging the scores of the tokens within them, thereby treating each token equally. Radmard et al. (2021) attempted to address this by segmenting sentences for token-level selections; however, this led to a loss of context and semantic meaning, impairing human understanding.…”
Section: Introduction (mentioning)
confidence: 99%
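The aggregation this excerpt criticizes, scoring a sentence by summing or averaging its per-token uncertainties so that every token counts equally, is easy to make concrete. A minimal sketch, assuming token_scores is a hypothetical NumPy array of per-token uncertainties (e.g. the MC-dropout entropies above):

import numpy as np

def sentence_score(token_scores: np.ndarray, mode: str = "mean") -> float:
    """Aggregate per-token uncertainties into one sentence-level score."""
    if mode == "sum":    # favors long sentences
        return float(token_scores.sum())
    if mode == "mean":   # length-normalized, but still weights tokens equally
        return float(token_scores.mean())
    raise ValueError(f"unknown mode: {mode}")

# Example: the first sentence contains one highly uncertain token and wins.
s1 = sentence_score(np.array([0.1, 0.9, 0.2]))  # 0.4
s2 = sentence_score(np.array([0.3, 0.3, 0.3]))  # 0.3 -> s1 is queried first

Because both variants treat tokens uniformly, a single uncertain entity can be diluted by many confident tokens, which is exactly the entity-level imbalance the excerpt points to.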
“…For example, Agrawal [15] gives a modified least-confidence sampling strategy, Marcheggiani [16] investigates the minimum token margin (MTM) strategy, a variant of margin sampling, and Balcan [17] offers the maximum token entropy (MTE) measure of the ambiguity about a token's label. In addition, a Bayesian uncertainty estimation method is applied in [18]. These techniques help select crucial samples for training and minimize labeling cost.…”
Section: Related Work (mentioning)
confidence: 99%
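The measures this excerpt names follow their standard definitions. A brief sketch, assuming probs is a hypothetical (seq_len, num_tags) NumPy array of per-token tag probabilities; all names are illustrative:

import numpy as np

def least_confidence(probs: np.ndarray) -> np.ndarray:
    # 1 minus the probability of each token's most likely tag.
    return 1.0 - probs.max(axis=-1)

def token_margin(probs: np.ndarray) -> np.ndarray:
    # MTM: gap between the two most probable tags; a small margin signals
    # high ambiguity, so a query strategy takes the minimum over tokens.
    top2 = np.sort(probs, axis=-1)[..., -2:]
    return top2[..., 1] - top2[..., 0]

def token_entropy(probs: np.ndarray) -> np.ndarray:
    # MTE: entropy of each token's tag distribution; queries take the maximum.
    return -(probs * np.log(np.clip(probs, 1e-12, None))).sum(axis=-1)

Each function returns one score per token; a sentence-level query strategy would then aggregate them, for instance with the sum or mean shown earlier.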