Ruixue Ding scite author profile

A Neural Multi-digraph Model for Chinese NER with Gazetteers

Gazetteers have been shown to be useful resources for named entity recognition (NER) (Ratinov and Roth, 2009). Many existing approaches to incorporating gazetteers into machine learning based NER systems rely on manually defined selection strategies or handcrafted templates, which may not always lead to optimal effectiveness, especially when multiple gazetteers are involved. This is particularly the case for Chinese NER, where words are not naturally tokenized, leading to additional ambiguities. To automatically learn how to incorporate multiple gazetteers into an NER system, we propose a novel approach based on graph neural networks with a multidigraph structure that captures the information that the gazetteers offer. Experiments on various datasets show that our model is effective in incorporating rich gazetteer information while resolving ambiguities, outperforming previous approaches.
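To make the ambiguity concrete, the following is a minimal sketch of how matches from several gazetteers over an unsegmented character sequence could be stored as labeled edges of a multidigraph. It is not the paper's implementation: the function name `gazetteer_edges`, the `max_len` parameter, the gazetteer names, and the toy entries are all assumptions chosen for illustration, and the learned graph neural network that would consume these edges is omitted.

```python
from typing import Dict, List, Set, Tuple

# An edge is (start, end, gazetteer_name); `end` is exclusive.
Edge = Tuple[int, int, str]


def gazetteer_edges(
    sentence: str,
    gazetteers: Dict[str, Set[str]],
    max_len: int = 4,  # hypothetical cap on the span length that is looked up
) -> List[Edge]:
    """Collect one labeled edge per (span, gazetteer) match.

    Characters are the nodes. Two gazetteers that contain the same span
    produce two parallel edges, which is why a multidigraph is needed.
    """
    edges: List[Edge] = []
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 1, last_end + 1):
            span = sentence[start:end]
            # Iterate gazetteers in sorted order for a deterministic result.
            for name in sorted(gazetteers):
                if span in gazetteers[name]:
                    edges.append((start, end, name))
    return edges


# Toy gazetteers (hypothetical entries, not taken from the paper).
toy_gazetteers = {
    "PER": {"张三", "张三丰"},
    "LOC": {"丰台"},
    "ORG": {"丰台"},
}

edges = gazetteer_edges("张三丰台", toy_gazetteers)
for edge in edges:
    print(edge)
```

In this toy input the spans (0, 3) and (2, 4) overlap on the character 丰, and the span (2, 4) carries two parallel edges from different gazetteers. A hand-written rule would have to pick one; the abstract's point is that a model can learn that choice from the graph instead.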

Better Modeling of Incomplete Annotations for Named Entity Recognition

Jie, Xie, Lu et al. 2019

Supervised approaches to named entity recognition (NER) are largely developed under the assumption that the training data is fully annotated with named entity information. In practice, however, annotated data is often imperfect, and one typical issue is that the training data may contain incomplete annotations. We highlight several pitfalls associated with learning under such a setup in the context of NER, identify limitations of existing approaches, and propose a novel yet easy-to-implement approach for recognizing named entities with incomplete data annotations. We demonstrate the effectiveness of our approach through extensive experiments.

Knowledge-aware Named Entity Recognition with Alleviating Heterogeneity

Ding, Xie, Huang et al. 2021, AAAI

Named Entity Recognition (NER) is a fundamental research topic underlying many downstream NLP tasks; it aims to detect named entities (NEs) mentioned in unstructured text and classify them into pre-defined categories. Learning from labeled data alone is far from enough for domain-specific or temporally evolving entities (e.g., medical terminology or restaurant names). Fortunately, open-source Knowledge Bases (KBs) such as Wikidata and Freebase contain NEs that are manually labeled with predefined types in different domains, which can help identify entity boundaries and recognize entity types more accurately. However, the type system of a domain-specific NER task is typically independent of that of current KBs and thus inevitably exhibits a heterogeneity issue, which makes matching between the original NER and KB types less likely (e.g., Person in NER potentially matches President in KBs), or introduces unintended noise when domain-specific knowledge is not considered (e.g., Band in NER should be mapped to Out_of_Entity_Types in the restaurant-related task). To better incorporate and denoise the abundant knowledge in KBs, we propose a new KB-aware NER framework (KaNa), which utilizes type-heterogeneous knowledge to improve NER. Specifically, for an entity mention along with a set of candidate entities that are linked from KBs, KaNa first uses a type projection mechanism that maps the mention type and entity types into a shared space to homogenize the heterogeneous entity types. Then, based on the projected types, a noise detector filters out less-confident candidate entities in an unsupervised manner. Finally, the filtered mention-entity pairs are injected into an NER model as a graph to predict answers. The experimental results demonstrate KaNa's state-of-the-art performance on five public benchmark datasets from different domains.
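The projection-then-filter idea can be illustrated with a small sketch. It assumes the NER type and the KB types have already been projected into a shared vector space, and it stands in cosine similarity with a fixed threshold for the noise detector. The vectors, the threshold, the entity names, and the function `filter_candidates` are hypothetical; the abstract does not specify how KaNa learns the projection or scores confidence.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_candidates(
    mention_type_vec: Vector,
    candidates: List[Tuple[str, str]],   # (candidate entity, KB type)
    kb_type_vecs: Dict[str, Vector],     # KB type -> vector in the shared space
    threshold: float = 0.5,              # assumed cutoff, not from the paper
) -> List[Tuple[str, str]]:
    """Keep candidates whose projected KB type is close to the mention type."""
    kept = []
    for entity, kb_type in candidates:
        if cosine(mention_type_vec, kb_type_vecs[kb_type]) >= threshold:
            kept.append((entity, kb_type))
    return kept


# Hand-set toy vectors; a real system would learn these projections.
person_vec: Vector = (1.0, 0.0)          # NER type "Person"
kb_type_vecs = {
    "President": (0.9, 0.1),             # close to Person in the shared space
    "Band": (0.0, 1.0),                  # unrelated to Person
}
candidates = [("entity_a", "President"), ("entity_b", "Band")]

kept = filter_candidates(person_vec, candidates, kb_type_vecs)
print(kept)
```

Here the President candidate survives because its projected type lies near Person, while the Band candidate is dropped as noise. In the framework described above, the surviving mention-entity pairs would then be passed to the NER model as a graph.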

Event Extraction with Deep Contextualized Word Representation and Multi-attention Layer

Ding, 2018

A Neural Multi-digraph Model for Chinese NER with Gazetteers
