2019
DOI: 10.1609/aaai.v33i01.33016762

Contextualized Non-Local Neural Networks for Sequence Learning

Abstract: Recently, a large number of neural mechanisms and models have been proposed for sequence learning, among which self-attention, as exemplified by the Transformer model, and graph neural networks (GNNs) have attracted particular attention. In this paper, we propose an approach that combines and draws on the complementary strengths of these two methods. Specifically, we propose contextualized non-local neural networks (CN³), which can both dynamically construct a task-specific structure of a sentence and leverage rich lo…
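The abstract describes an architecture that derives a sentence graph from self-attention scores and then passes messages over it. Below is a minimal sketch of that reading in plain numpy; the function name, shapes, and single-layer setup are illustrative assumptions, not the authors' implementation.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def contextualized_nonlocal_layer(H, Wq, Wk, Wv):
    # H: (n_tokens, d) token representations; Wq/Wk/Wv: (d, d) projections.
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    # The attention matrix doubles as a soft, dynamically constructed
    # adjacency over the sentence -- the "task-specific structure".
    A = softmax(Q @ K.T / np.sqrt(H.shape[1]))
    # One round of message passing over that induced graph, with a
    # residual connection so purely local information is preserved.
    return H + A @ V

rng = np.random.default_rng(0)
n, d = 5, 8                                   # 5 tokens, 8-dim embeddings
H = rng.normal(size=(n, d))
Wq, Wk, Wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
print(contextualized_nonlocal_layer(H, Wq, Wk, Wv).shape)  # -> (5, 8)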

Cited by 37 publications (16 citation statements); references 2 publications (2 reference statements).
“…We can conclude that sentence-context information is not significantly helpful for disease mention normalization. This is inconsistent with the findings of previous work, in which sentence information has been shown to be useful for other NLP tasks [45], [46].…”
Section: B. Effect of Sentence-Context Information (contrasting)
confidence: 94%
“…The success of the Transformer has spurred a large body of follow-up work, and several Transformer variants have been proposed, such as GPT (Radford et al., 2018), BERT (Devlin et al., 2018), Transformer-XL (Dai et al., 2019), Universal Transformer (Dehghani et al., 2018), and CN³ (Liu et al., 2018a).…”
Section: Related Work (mentioning)
confidence: 99%
“…GNN for NLP: Recently, there has been considerable interest in applying GNNs to NLP tasks, with great success. For example, in neural machine translation, GNNs have been employed to integrate syntactic and semantic information into encoders (Bastings et al., 2017; Marcheggiani et al., 2018); GNNs have also been applied to relation extraction over pruned dependency trees; Yao et al. (2018) employed a GNN over a heterogeneous graph for text classification, which inspired our idea of the HDE graph; and Liu et al. (2018) proposed a contextualized neural network for sequence learning that leverages various types of non-local contextual information via information passing over a GNN. These studies are related to our work in that we also use GNNs to improve information interaction over long contexts or across documents.…”
Section: Related Work (mentioning)
confidence: 99%
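The citation statement above summarises a common pattern in these works: encode a sentence, then let tokens exchange information along a graph such as a dependency tree. The toy illustration below shows one such message-passing step; the parse edges and GCN-style update rule are assumptions for exposition, not code from any cited paper.

import numpy as np

def gcn_step(H, A, W):
    # One graph-convolution step: add self-loops, mean-aggregate
    # neighbour features, project, apply ReLU.
    A_hat = A + np.eye(A.shape[0])
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.maximum(0.0, (A_hat / deg) @ H @ W)

# Toy sentence "the cat sat down" with edges from a made-up parse:
# det(cat, the), nsubj(sat, cat), prt(sat, down).
edges = [(0, 1), (1, 2), (2, 3)]
n, d = 4, 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0    # treat the tree as an undirected graph

rng = np.random.default_rng(1)
H = rng.normal(size=(n, d))    # token embeddings (random stand-ins)
W = 0.1 * rng.normal(size=(d, d))
H = gcn_step(H, A, W)          # each token now mixes with its neighbours
print(H.shape)                 # -> (4, 6)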