Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.318

GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition

Abstract: Instead of using expensive manual annotations, researchers have proposed to train named entity recognition (NER) systems using heuristic labeling rules. However, devising labeling rules is challenging because it often requires a considerable amount of manual effort and domain expertise. To alleviate this problem, we propose GLaRA, a graph-based labeling rule augmentation framework, to learn new labeling rules from unlabeled data. We first create a graph with nodes representing candidate rules extracted from un…

Cited by 9 publications (11 citation statements)
References 41 publications
“…Zhao et al [161] propose a weakly supervised method where they manually prepare some seeding rules and automatically extract all possible rules from unlabeled text for each of the six rule types, and connect them in a graph using cosine similarity. Note that the rule is represented by the average contextual embedding of its matched candidate entities.…”
Section: Rule-based Methods
Citation type: mentioning
confidence: 99%
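The graph construction described in the statement above can be illustrated with a minimal sketch. It assumes only what the statement says: a rule is represented by the average embedding of the candidate entities it matches, and rules are connected using cosine similarity. The function names, the similarity threshold, the rule names, and the toy 2-dimensional vectors are all hypothetical and are not taken from the GLaRA paper or its code.

```python
import numpy as np


def rule_embedding(entity_embeddings):
    """Represent a rule as the mean embedding of the candidate entities it matches."""
    return np.mean(np.asarray(entity_embeddings, dtype=float), axis=0)


def cosine_similarity_matrix(rule_vectors):
    """Pairwise cosine similarity between rule vectors (one rule per row)."""
    vectors = np.asarray(rule_vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def build_rule_graph(rule_vectors, threshold=0.8):
    """Return undirected edges (i, j, similarity) for rule pairs at or above the threshold.

    Thresholding is an assumption made for this sketch; the statement above
    only says that rules are connected using cosine similarity.
    """
    sims = cosine_similarity_matrix(rule_vectors)
    n = sims.shape[0]
    return [
        (i, j, float(sims[i, j]))
        for i in range(n)
        for j in range(i + 1, n)
        if sims[i, j] >= threshold
    ]


# Hypothetical rules with made-up embeddings of their matched candidate entities.
rules = {
    "suffix=-itis": [[1.0, 0.0], [0.9, 0.1]],
    "prefix=neuro": [[0.8, 0.2], [1.0, 0.2]],
    "context=lives in": [[0.0, 1.0], [0.1, 0.9]],
}
names = list(rules)
vectors = [rule_embedding(rules[name]) for name in names]
edges = build_rule_graph(vectors, threshold=0.8)

for i, j, sim in edges:
    print(f"{names[i]} <-> {names[j]}: {sim:.3f}")
```

In this toy example only the first two rules end up connected, since their averaged vectors point in nearly the same direction. In practice the embeddings would come from a contextual model such as the fine-tuned BERT mentioned in the citing statements.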
“…However, this method is restricted by the entity labeling granularity where we can find some nested entities. The method of Zhao et al [161] avoids the ambiguity as it automatically propagates some seed rules based on lexical or contextual clues which are strong indicators of entity recognition. In addition, the authors have fine-tuned a pre-trained contextual embedding model BERT in the biomedical domain.…”
Section: Ni
Citation type: mentioning
confidence: 99%
“…The methods generally require an initial set of labeled data, or seed LFs developed by users. Snuba [33] learns weak classifiers as heuristic models from a small labeled dataset; TALLOR [16] and GLaRA [39] use an initial set of seed LFs to generate new ones by compounding multiple simpler LFs and by exploiting the semantic relationship of the seed LFs respectively; [31] applies program synthesis to generate task-level LFs from a set of labeled data and domain-level LFs.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…[7] and [25] interactively generate labeling functions based on user feedback. TALLOR [46] and GLaRA [106] automatically augment an initial set of labeling functions with new ones. Different from existing works that optimize the task performance, the procedural labeling function generators in WRENCH facilitate the study of the impact of different weak supervision sources.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…(2) Active generation and repurposing of supervision sources. To further reduce human annotation efforts, very recently, researchers turn to active generation [91,46,106,7,25] and repurposing [27] of supervision sources. In the future, we plan to incorporate these new tasks and methods into WRENCH to extend its scope.…”
Section: A3 Hosting and Maintenance Plan
Citation type: mentioning
confidence: 99%