Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.424
ChemNER: Fine-Grained Chemistry Named Entity Recognition with Ontology-Guided Distant Supervision

Abstract: Scientific literature analysis needs fine-grained named entity recognition (NER) to provide a wide range of information for scientific discovery. For example, chemistry research needs to study dozens to hundreds of distinct, fine-grained entity types, making consistent and accurate annotation difficult even for crowds of domain experts. On the other hand, domain-specific ontologies and knowledge bases (KBs) can be easily accessed, constructed, or integrated, which makes distant supervision realistic for fine-g…

Cited by 11 publications (4 citation statements) · References 32 publications
“…Several strategies were developed and investigated to exploit external lexical and semantic resources to improve machine learning models. These strategies include thematic masking [1], named entity recognition by distant supervision [2], and ontology-based normalization [3]. The biological roles of MOs depend mainly on their structure.…”
Section: Value of the Data
confidence: 99%
“…Distant supervision (Mintz et al., 2009) uses structured knowledge to annotate raw text with pseudo labels. Performing distantly supervised fine-tuning with in-domain structured knowledge after the MLM pre-training is effective in domain-specific NER (Wang et al., 2021; Trieu et al., 2022). However, distantly supervised learning in a specific domain depends on how well the structured knowledge covers the label set of the downstream task.…”
Section: NER With Unstructured Knowledge
confidence: 99%
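The statement above describes distant supervision for NER: entity surface forms from a knowledge base are matched against raw text to produce pseudo labels, with no manual annotation. A minimal sketch of this idea via greedy longest-match dictionary labeling is shown below; the KB entries, entity types, and example sentence are illustrative assumptions, not taken from the cited ontologies.

```python
# Hedged sketch of distantly supervised NER labeling: a toy KB maps
# lowercased token tuples to (assumed, illustrative) entity types, and
# greedy longest-match assigns BIO pseudo labels to a token sequence.

def distant_labels(tokens, kb):
    """Assign BIO pseudo labels by greedy longest-match against `kb`,
    a dict mapping tuples of lowercased tokens to entity type strings."""
    labels = ["O"] * len(tokens)
    max_len = max((len(key) for key in kb), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest possible span first, shrinking to length 1.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in kb:
                labels[i] = "B-" + kb[span]
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + kb[span]
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels

# Illustrative KB and sentence (hypothetical entries, not from a real ontology).
kb = {("sodium", "chloride"): "INORGANIC_COMPOUND", ("ethanol",): "ALCOHOL"}
tokens = ["Sodium", "chloride", "dissolves", "in", "ethanol", "."]
print(distant_labels(tokens, kb))
# → ['B-INORGANIC_COMPOUND', 'I-INORGANIC_COMPOUND', 'O', 'O', 'B-ALCOHOL', 'O']
```

The coverage caveat in the quoted passage is visible even in this toy: any entity type or surface form absent from the KB is silently labeled "O", so pseudo-label quality is bounded by how well the structured knowledge covers the downstream label set.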
“…(3) Experiments: Extensive experiments on two public datasets (Tabassum et al., 2020; Bridges et al., 2013) covering four domains (i.e., StackOverflow, GitHub, National Vulnerability Database, and Metasploit) demonstrate the effectiveness of SETYPE given 10 to 15 fine-grained types related to code, software, and security. Although we focus on software and security domain examples, our entity typing framework can be applied to other specialized domains including science (Wang et al., 2021) and engineering (O'Gorman et al., 2021).…”
Section: Introduction
confidence: 99%