Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua 2021
DOI: 10.18653/v1/2021.naacl-main.118

GEMNET: Effective Gated Gazetteer Representations for Recognizing Complex Entities in Low-context Input

Abstract: Named Entity Recognition (NER) remains difficult in real-world settings; current challenges include short texts (low context), emerging entities, and complex entities (e.g. movie names). Gazetteer features can help, but results have been mixed due to challenges with adding extra features, and a lack of realistic evaluation data. It has been shown that including gazetteer features can cause models to overuse or underuse them, leading to poor generalization. We propose GEMNET, a novel approach for gazetteer know…

Cited by 42 publications (51 citation statements)
References 26 publications

“…In order to improve the performance of our models on low-context instances, a set of annotated sentences are generated from the MS-MARCO QnA corpus (V2.1) (Nguyen et al, 2016) and the ORCAS dataset (Craswell et al, 2020), which are mentioned in Meng et al (2021). Our trained models and existing NER systems (e.g., spaCy) are applied to identify entities in these corpora, and only templates identically recognized by all models have been reserved.…”
Section: Data Preparation (mentioning)
confidence: 99%
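
The agreement filter described in this statement (keep only sentences that every NER system annotates identically) can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function name `keep_if_all_agree`, the `(start, end, label)` entity format, and the tagger interface are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical entity format: (start_token, end_token_exclusive, label).
Entity = Tuple[int, int, str]
# Hypothetical tagger interface: tokens in, predicted entities out.
Tagger = Callable[[List[str]], List[Entity]]


def keep_if_all_agree(
    sentences: List[List[str]], taggers: List[Tagger]
) -> List[Tuple[List[str], List[Entity]]]:
    """Keep only sentences on which every tagger predicts the same entities."""
    kept = []
    for tokens in sentences:
        # Compare predictions as sets so entity order does not matter.
        predictions = [frozenset(tagger(tokens)) for tagger in taggers]
        if predictions and all(p == predictions[0] for p in predictions):
            kept.append((tokens, sorted(predictions[0])))
    return kept
```

Note that under this sketch a sentence on which all taggers predict no entities also counts as agreement; whether such sentences should be kept is a design choice the statement does not specify.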
“…One of the classic approaches to solving this problem is to integrate external entity knowledge or gazetteers into neural architectures (Liu et al, 2019; Rijhwani et al, 2020; Meng et al, 2021). Typically, the two representations from a language model like BERT (Devlin et al, 2019) and a gazetteer network like BiLSTM (Hochreiter and Schmidhuber, 1997) respectively are combined as one merged embedding, which is further fed into a NER classifier such as a conditional random field (CRF) (Lafferty et al, 2001).…”
Section: Introduction (mentioning)
confidence: 99%
“…The algorithms used were Decision tree, Long Short-Term Memory (LSTM), and Conditional Random Field (CRF). (Meng et al, 2021; Fetahu et al, 2021) presented a novel CM NER model. They proposed a gated architecture that enhances existing multilingual Transformers by dynamically infusing multilingual knowledge bases, a.k.a gazetteers.…”
Section: Related Work (mentioning)
confidence: 99%
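
To make the idea of a gate that blends a contextual representation with a gazetteer representation concrete, here is a minimal per-dimension sketch. It is a generic gating scheme under simplifying assumptions, not the formulation from the cited papers: the function `gated_merge`, the element-wise (diagonal) gate weights `w_ctx` and `w_gaz`, and the `bias` vector are all hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_merge(
    h_ctx: List[float],  # contextual vector for one token (e.g. from a Transformer)
    h_gaz: List[float],  # gazetteer vector for the same token
    w_ctx: List[float],  # assumed element-wise gate weights on h_ctx
    w_gaz: List[float],  # assumed element-wise gate weights on h_gaz
    bias: List[float],   # assumed gate bias
) -> List[float]:
    """Blend two same-sized vectors with a learned-style sigmoid gate."""
    if not (len(h_ctx) == len(h_gaz) == len(w_ctx) == len(w_gaz) == len(bias)):
        raise ValueError("all vectors must have the same length")
    merged = []
    for c, g, wc, wg, b in zip(h_ctx, h_gaz, w_ctx, w_gaz, bias):
        gate = sigmoid(wc * c + wg * g + b)  # in (0, 1): share given to context
        merged.append(gate * c + (1.0 - gate) * g)
    return merged
```

With all gate weights and biases at zero the gate is 0.5 and the result is the plain average of the two vectors; a large positive bias pushes the output toward the contextual vector, which is the kind of per-token weighting a gate is meant to provide.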
“…Complex named entities (e.g., titles of creative works) are typically not simple nouns and are harder to recognize. The challenges of NER for recognizing complex entities and in low-context situations was recently outlined by Meng et al (2021b). Other work has extended this to multilingual and code-mixed settings (Fetahu et al, 2021).…”
Section: Related Work (mentioning)
confidence: 99%
“…Recognizing complex named entities (NEs) is a challenging research problem, but it has not received sufficient attention from the natural language processing community (Meng et al, 2021a;Fetahu et al, 2021). Complex NEs can be complex noun phrases (e.g., National Baseball Hall of Fame and Museum), gerunds (e.g., Saving Private Ryan), infinitives (e.g., To Build a Fire), or even full clauses (e.g., I Capture The Castle).…”
Section: Introduction (mentioning)
confidence: 99%