Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2016)
DOI: 10.18653/v1/n16-1030
Neural Architectures for Named Entity Recognition

Abstract: State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available. In this paper, we introduce two new neural architectures: one based on bidirectional LSTMs and conditional random fields, and the other that constructs and labels segments using a transition-based approach inspired by shift-reduce parsers. Our models rely on two sources of information about words: chara…
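The abstract's second architecture labels whole segments through shift-reduce-style transitions. The following is a minimal, non-neural sketch of that transition system: SHIFT moves a word onto a stack, REDUCE with a label pops the stack into a labeled segment, and OUT emits a word outside any entity. The action names, example sentence, and label set here are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of a shift-reduce-style transition system for segment labeling.
# Actions: SHIFT (buffer -> stack), OUT (buffer -> output as "O"),
# REDUCE-<LABEL> (pop the whole stack into one labeled segment).
def apply_transitions(words, actions):
    buffer, stack, output = list(words), [], []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))          # move next word onto the stack
        elif act == "OUT":
            output.append((buffer.pop(0), "O"))  # word is outside any entity
        elif act.startswith("REDUCE-"):
            label = act.split("-", 1)[1]
            output.append((" ".join(stack), label))  # close the segment
            stack.clear()
    return output

# Hypothetical example: two actions build a person segment, one a location.
print(apply_transitions(
    ["Mark", "Watney", "visited", "Mars"],
    ["SHIFT", "SHIFT", "REDUCE-PER", "OUT", "SHIFT", "REDUCE-LOC"],
))
# [('Mark Watney', 'PER'), ('visited', 'O'), ('Mars', 'LOC')]
```

In the full model, a neural network would score the possible actions at each step; this sketch only shows how a chosen action sequence is turned into labeled segments.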

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

29
2,927
4
10

Year Published

2016
2016
2021
2021

Publication Types

Select...
5
1

Relationship

0
6

Authors

Journals

Cited by 3,273 publications (2,970 citation statements). References 25 publications.
“…More recent work on sequence labeling tasks relies instead on deep learning techniques such as convolutional or recurrent neural network models (CNNs, LeCun et al., 1989, and RNNs, Rumelhart, 1986, respectively), without the need for any hand-crafted features (Kim, 2014; Huang et al., 2015; Zhang et al., 2015; Chiu and Nichols, 2016; Lample et al., 2016; Ma and Hovy, 2016; Yang et al., 2016; Strubell et al., 2017). RNNs in particular typically rely on a neural network architecture built using one or more bidirectional Long Short-Term Memory (BiLSTM) layers, as this type of neural cell provides variable-length memory, allowing the model to capture relationships within sequences of proximal words.…”
Section: Related Work
confidence: 99%
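The BiLSTM sequence-labeling setup described in this citation statement can be sketched in a few lines of PyTorch. The vocabulary size, embedding and hidden dimensions, and tag count below are placeholder assumptions for illustration, not values from the cited papers.

```python
# Minimal BiLSTM sequence-labeling sketch (assumed dimensions, PyTorch).
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional LSTM: each position sees both left and right context.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        # Per-token projection onto tag scores (a CRF layer could sit on top instead).
        self.proj = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):        # token_ids: (batch, seq_len)
        x = self.embed(token_ids)        # (batch, seq_len, emb_dim)
        h, _ = self.lstm(x)              # (batch, seq_len, 2 * hidden_dim)
        return self.proj(h)              # per-token tag scores

# Toy usage: a batch of 2 sentences of length 12 over a 10k-word vocabulary.
model = BiLSTMTagger(vocab_size=10_000, num_tags=9)
scores = model(torch.randint(0, 10_000, (2, 12)))   # shape (2, 12, 9)
```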
“…Such architectures have achieved state-of-the-art performance for both POS and NER tasks on popular datasets (Reimers and Gurevych, 2017b). Current state-of-the-art architectures for sequence labeling include the use of a CRF prediction layer (Huang et al., 2015) and the use of character-level word embeddings to complement word embeddings, trained either with CNNs (Ma and Hovy, 2016) or BiLSTM RNNs (Lample et al., 2016). Character-level word embeddings have indeed been shown to perform well on a variety of NLP tasks (Dos Santos and Gatti de Bayser, 2014; Kim et al., 2015; Zhang et al., 2015).…”
Section: Related Work
confidence: 99%
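The character-level word embeddings mentioned here can be built, as in the BiLSTM variant this statement attributes to Lample et al. (2016), by running each word's characters through a small BiLSTM and concatenating the final states with a conventional word embedding. The sketch below follows that idea with assumed dimensions and names; it is illustrative, not the cited implementation.

```python
# Sketch: character-level word representation (char BiLSTM) + word embedding.
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    def __init__(self, n_chars, n_words, char_dim=25, char_hidden=25, word_dim=100):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden, bidirectional=True, batch_first=True)
        self.word_embed = nn.Embedding(n_words, word_dim)

    def forward(self, word_id, char_ids):
        # char_ids: (1, word_len) character indices for a single word.
        _, (h_n, _) = self.char_lstm(self.char_embed(char_ids))
        # Concatenate the forward and backward final states of the char BiLSTM.
        char_repr = torch.cat([h_n[0], h_n[1]], dim=-1)      # (1, 2 * char_hidden)
        word_repr = self.word_embed(word_id)                  # (1, word_dim)
        # The combined vector would feed the sentence-level BiLSTM(-CRF) tagger.
        return torch.cat([word_repr, char_repr], dim=-1)

# Toy usage with made-up indices for one word.
enc = CharWordEncoder(n_chars=80, n_words=10_000)
vec = enc(torch.tensor([42]), torch.tensor([[3, 7, 7, 12, 5]]))
print(vec.shape)   # torch.Size([1, 150])
```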