Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2016)
DOI: 10.18653/v1/n16-1030
Neural Architectures for Named Entity Recognition

Abstract: State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available. In this paper, we introduce two new neural architectures: one based on bidirectional LSTMs and conditional random fields, and the other that constructs and labels segments using a transition-based approach inspired by shift-reduce parsers. Our models rely on two sources of information about words: chara…
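The abstract's second architecture labels whole segments through shift-reduce-style transitions. The following is a minimal, non-neural sketch of that transition system: SHIFT moves a word onto a stack, REDUCE with a label pops the stack into a labeled segment, and OUT emits a word outside any entity. The action names, example sentence, and label set here are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of a shift-reduce-style transition system for segment labeling.
# Actions: SHIFT (buffer -> stack), OUT (buffer -> output as "O"),
# REDUCE-<LABEL> (pop the whole stack into one labeled segment).
def apply_transitions(words, actions):
    buffer, stack, output = list(words), [], []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))          # move next word onto the stack
        elif act == "OUT":
            output.append((buffer.pop(0), "O"))  # word is outside any entity
        elif act.startswith("REDUCE-"):
            label = act.split("-", 1)[1]
            output.append((" ".join(stack), label))  # close the segment
            stack.clear()
    return output

# Hypothetical example: two actions build a person segment, one a location.
print(apply_transitions(
    ["Mark", "Watney", "visited", "Mars"],
    ["SHIFT", "SHIFT", "REDUCE-PER", "OUT", "SHIFT", "REDUCE-LOC"],
))
# [('Mark Watney', 'PER'), ('visited', 'O'), ('Mars', 'LOC')]
```

In the full model, a neural network would score the possible actions at each step; this sketch only shows how a chosen action sequence is turned into labeled segments.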

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

29
2,927
4
10

Year Published

2016
2016
2021
2021

Publication Types

Select...
5
1

Relationship

0
6

Authors

Journals

Cited by 3,273 publications (2,970 citation statements). References 25 publications.
“…More recent work on sequence labeling tasks relies instead on deep learning techniques such as convolutional or recurrent neural network models (CNNs, LeCun et al., 1989, and RNNs, Rumelhart, 1986, respectively), without the need for any hand-crafted features (Kim, 2014; Huang et al., 2015; Zhang et al., 2015; Chiu and Nichols, 2016; Lample et al., 2016; Ma and Hovy, 2016; Yang et al., 2016; Strubell et al., 2017). RNNs in particular typically rely on a neural network architecture built using one or more bidirectional Long Short-Term Memory (BiLSTM) layers, as this type of neural cell provides variable-length memory, allowing the model to capture relationships within sequences of proximal words.…”
Section: Related Work
confidence: 99%
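The BiLSTM sequence-labeling setup described in this citation statement can be sketched in a few lines of PyTorch. The vocabulary size, embedding and hidden dimensions, and tag count below are placeholder assumptions for illustration, not values from the cited papers.

```python
# Minimal BiLSTM sequence-labeling sketch (assumed dimensions, PyTorch).
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional LSTM: each position sees both left and right context.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        # Per-token projection onto tag scores (a CRF layer could sit on top instead).
        self.proj = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):        # token_ids: (batch, seq_len)
        x = self.embed(token_ids)        # (batch, seq_len, emb_dim)
        h, _ = self.lstm(x)              # (batch, seq_len, 2 * hidden_dim)
        return self.proj(h)              # per-token tag scores

# Toy usage: a batch of 2 sentences of length 12 over a 10k-word vocabulary.
model = BiLSTMTagger(vocab_size=10_000, num_tags=9)
scores = model(torch.randint(0, 10_000, (2, 12)))   # shape (2, 12, 9)
```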
“…Such architectures have achieved state-of-the-art performance for both POS and NER tasks on popular datasets (Reimers and Gurevych, 2017b). Current state-of-the-art architectures for sequence labeling include the use of a CRF prediction layer (Huang et al., 2015) and the use of character-level word embeddings to complement word embeddings, trained either with CNNs (Ma and Hovy, 2016) or BiLSTM RNNs (Lample et al., 2016). Character-level word embeddings have indeed been shown to perform well on a variety of NLP tasks (Dos Santos and Gatti de Bayser, 2014; Kim et al., 2015; Zhang et al., 2015).…”
Section: Related Work
confidence: 99%
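The character-level word embeddings mentioned here can be built, as in the BiLSTM variant this statement attributes to Lample et al. (2016), by running each word's characters through a small BiLSTM and concatenating the final states with a conventional word embedding. The sketch below follows that idea with assumed dimensions and names; it is illustrative, not the cited implementation.

```python
# Sketch: character-level word representation (char BiLSTM) + word embedding.
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    def __init__(self, n_chars, n_words, char_dim=25, char_hidden=25, word_dim=100):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden, bidirectional=True, batch_first=True)
        self.word_embed = nn.Embedding(n_words, word_dim)

    def forward(self, word_id, char_ids):
        # char_ids: (1, word_len) character indices for a single word.
        _, (h_n, _) = self.char_lstm(self.char_embed(char_ids))
        # Concatenate the forward and backward final states of the char BiLSTM.
        char_repr = torch.cat([h_n[0], h_n[1]], dim=-1)      # (1, 2 * char_hidden)
        word_repr = self.word_embed(word_id)                  # (1, word_dim)
        # The combined vector would feed the sentence-level BiLSTM(-CRF) tagger.
        return torch.cat([word_repr, char_repr], dim=-1)

# Toy usage with made-up indices for one word.
enc = CharWordEncoder(n_chars=80, n_words=10_000)
vec = enc(torch.tensor([42]), torch.tensor([[3, 7, 7, 12, 5]]))
print(vec.shape)   # torch.Size([1, 150])
```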