Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.482
BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition

Abstract: We study the problem of learning a named entity recognition (NER) tagger using noisy labels from multiple weak supervision sources. Though cheap to obtain, the labels from weak supervision sources are often incomplete, inaccurate, and contradictory, making it difficult to learn an accurate NER model. To address this challenge, we propose a conditional hidden Markov model (CHMM), which can effectively infer true labels from multi-source noisy labels in an unsupervised way. CHMM enhances the classic hidden Markov model…
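To make the abstract's core idea concrete, here is a minimal sketch of the CHMM mechanism: token-wise transition and emission probabilities predicted from contextual embeddings, with multiple noisy label sources scored jointly by a forward pass. This is not the authors' code; the random embeddings (standing in for BERT output), the untrained linear maps, the tag/source counts, and the random observations are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5    # number of hidden NER tags (e.g., O, B-PER, I-PER, B-ORG, I-ORG)
D = 16   # embedding size (BERT gives 768; 16 keeps the demo small)
S = 3    # number of weak supervision sources
T = 4    # sentence length

# Hypothetical per-token contextual embeddings standing in for BERT output.
emb = rng.normal(size=(T, D))

# Untrained linear maps from embeddings to token-wise transition/emission logits.
W_trans = 0.1 * rng.normal(size=(D, K * K))
W_emit = 0.1 * rng.normal(size=(D, K * K))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# A[t, i, j] = P(y_t = j | y_{t-1} = i, x_t): transitions conditioned on the token.
A = softmax((emb @ W_trans).reshape(T, K, K), axis=-1)
# B[t, i, k] = P(a source emits label k | y_t = i, x_t): token-wise emissions.
B = softmax((emb @ W_emit).reshape(T, K, K), axis=-1)

# Noisy labels from each source (random stand-ins for rule/dictionary output).
obs = rng.integers(0, K, size=(S, T))

def emission_lik(t):
    # Sources are treated as conditionally independent given the true tag,
    # so their emission likelihoods multiply.
    lik = np.ones(K)
    for s in range(S):
        lik *= B[t, :, obs[s, t]]
    return lik

# Forward pass: marginal likelihood of the multi-source observations.
alpha = np.full(K, 1.0 / K) * emission_lik(0)
for t in range(1, T):
    alpha = (alpha @ A[t]) * emission_lik(t)
print("P(observations) =", alpha.sum())
```

In a trained model the two linear maps would be fit by maximizing this marginal likelihood, and posterior decoding over the hidden tags would yield the denoised labels.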

Cited by 15 publications (29 citation statements) | References 35 publications
“…Most studies focus on developing label models while leaving the end model flexible to the downstream tasks. Existing label models include Majority Voting (MV), Probabilistic Graphical Models (PGM) [14,77,75,22,53,82,50], etc. Note that prior crowd-worker modeling work can be included and subsumed by this set of approaches, e.g.…”
Section: Two-stage Methods
Mentioning confidence: 99%
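Of the label models named in this statement, Majority Voting is the simplest baseline: each token takes the most frequent label across the weak sources. A small sketch, with an illustrative tag set and made-up votes:

```python
from collections import Counter

# rows = weak supervision sources, columns = tokens (hypothetical votes)
votes = [
    ["B-PER", "O",     "B-ORG", "O"],
    ["B-PER", "O",     "O",     "O"],
    ["O",     "B-LOC", "B-ORG", "O"],
]

def majority_vote(votes):
    n_tokens = len(votes[0])
    out = []
    for t in range(n_tokens):
        column = [source[t] for source in votes]
        label, _ = Counter(column).most_common(1)[0]
        out.append(label)
    return out

print(majority_vote(votes))  # ['B-PER', 'O', 'B-ORG', 'O']
```

Unlike the probabilistic graphical models cited above, majority voting ignores source reliability and label dependencies, which is why the PGM family generally outperforms it on noisy multi-source data.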
“…[53,71] use a standard HMM with multiple observed variables, each from one labeling source. [82] improves the HMM by introducing unique linking rules as an additional supervision source; [50] predicts token-wise transition and emission probabilities from BERT embeddings to utilize the context information. In addition, [45] is a one-stage method that models each labeling source with a CRF layer and aggregates their transitions with an attention network.…”
Section: Related Work
Mentioning confidence: 99%
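The standard multi-source HMM this statement attributes to [53,71] differs from the conditional variant sketched earlier in that its transitions are static and each source gets its own emission ("confusion") matrix. A minimal sketch with forward-backward posterior decoding; all parameters here are random stand-ins for values that would normally be learned with EM:

```python
import numpy as np

rng = np.random.default_rng(1)
K, S, T = 3, 2, 5  # tag count, number of sources, sentence length

pi = np.full(K, 1.0 / K)                 # initial tag distribution
A = rng.dirichlet(np.ones(K), size=K)    # static transitions P(y_t | y_{t-1})
# One confusion matrix per source: E[s, i, k] = P(source s outputs k | true tag i).
E = rng.dirichlet(np.ones(K), size=(S, K))

obs = rng.integers(0, K, size=(S, T))    # noisy labels from each source

def lik(t):
    # Product over sources (conditional independence given the true tag).
    l = np.ones(K)
    for s in range(S):
        l *= E[s, :, obs[s, t]]
    return l

# Forward-backward over the hidden true-tag chain.
alpha = np.zeros((T, K))
beta = np.zeros((T, K))
alpha[0] = pi * lik(0)
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * lik(t)
beta[-1] = 1.0
for t in range(T - 2, -1, -1):
    beta[t] = A @ (lik(t + 1) * beta[t + 1])

# Posterior marginals over true tags; argmax gives the aggregated labels.
post = alpha * beta
post /= post.sum(axis=1, keepdims=True)
print("inferred tags:", post.argmax(axis=1))
```

The BERT-conditioned variant in [50] (the paper under review) replaces the static `A` and `E` with per-token matrices predicted from contextual embeddings, which is what lets the model exploit sentence context during aggregation.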