Lingyong Yan scite author profile

Lingyong Yan

5 Publications

86 Citation Statements Received

176 Citation Statements Given

[Charts: How they've been cited (144); How they cite others (176)]

Affiliations

Chinese Academy of Sciences, Institute of Software

Publications


Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Cao, Lin, Han et al. 2021


Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. In this paper, we conduct a rigorous study to explore the underlying predicting mechanisms of MLMs over different extraction paradigms. By investigating the behaviors of MLMs, we find that the previous decent performance mainly owes to biased prompts that overfit dataset artifacts. Furthermore, incorporating illustrative cases and external contexts improves knowledge prediction mainly due to entity type guidance and golden answer leakage. Our findings shed light on the underlying predicting mechanisms of MLMs, and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases.


Learning to Bootstrap for Entity Set Expansion

Yan, Han, Sun et al. 2019


Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category. Traditional bootstrapping methods often suffer from two problems: 1) delayed feedback, i.e., the evaluation of a pattern relies on both its direct extraction quality and the extraction quality in later iterations; 2) sparse supervision, i.e., only a few seed entities are used as supervision. To address these two problems, we propose a novel bootstrapping method combining the Monte Carlo Tree Search (MCTS) algorithm with a deep similarity network, which can efficiently estimate delayed feedback for pattern evaluation and adaptively score entities given sparse supervision signals. Experimental results confirm the effectiveness of the proposed method.


End-to-End Bootstrapping Neural Network for Entity Set Expansion

Yan, Han et al. 2020

AAAI


Bootstrapping for entity set expansion (ESE) has long been modeled as a multi-step pipelined process. Such a paradigm, unfortunately, often suffers from two main challenges: 1) the entities are expanded in multiple separate steps, which tends to introduce noisy entities and results in the semantic drift problem; 2) it is hard to exploit the high-order entity-pattern relations for entity set expansion. In this paper, we propose an end-to-end bootstrapping neural network for entity set expansion, named BootstrapNet, which models the bootstrapping in an encoder-decoder architecture. In the encoding stage, a graph attention network is used to capture both the first- and the high-order relations between entities and patterns, and encode useful information into their representations. In the decoding stage, the entities are sequentially expanded through a recurrent neural network, which outputs entities at each stage, and its hidden state vectors, representing the target category, are updated at each expansion step. Experimental results demonstrate substantial improvement of our model over previous ESE approaches.


Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Cao, Lin, Han et al. 2021

Preprint


Global Bootstrapping Neural Network for Entity Set Expansion

Yan, Han et al. 2020


Bootstrapping for entity set expansion (ESE), which expands new entities using only a few seed entities as supervision, has been studied for a long time. Recent end-to-end bootstrapping approaches have shown their advantages in information capturing and bootstrapping process modeling. However, due to the sparse supervision problem, previous end-to-end methods often leverage only information from near neighborhoods (local semantics) rather than information propagated from the co-occurrence structure of the whole corpus (global semantics). To address this issue, this paper proposes the Global Bootstrapping Network (GBN) with "pre-training and fine-tuning" strategies for effective learning. Specifically, it contains a global-sighted encoder to capture and encode both local and global semantics into entity embeddings, and an attention-guided decoder to sequentially expand new entities based on these embeddings. The experimental results show that the GBN learned by the "pre-training and fine-tuning" strategies achieves state-of-the-art performance on two bootstrapping datasets.

