Proceedings of the First Workshop on Subword and Character Level Models in NLP 2017
DOI: 10.18653/v1/w17-4118
A General-Purpose Tagger with Convolutional Neural Networks

Abstract: We present a general-purpose tagger based on convolutional neural networks (CNN), used both for composing word vectors and for encoding context information. The CNN tagger is robust across different tagging tasks: without task-specific tuning of hyper-parameters, it achieves state-of-the-art results in part-of-speech tagging, morphological tagging and supertagging. The CNN tagger is also robust against the out-of-vocabulary problem; it performs well on artificially unnormalized texts.
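
Since the abstract describes the architecture only at a high level, here is a minimal PyTorch sketch of the two-level design: one convolution composes a word vector from its characters, and a second convolution encodes sentence context before tag scores are produced. The `CNNTagger` name, all layer sizes, and the single kernel size are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn


class CNNTagger(nn.Module):
    """Hypothetical two-level CNN tagger: char CNN + context CNN."""

    def __init__(self, n_chars, n_tags, char_dim=32, word_dim=128,
                 context_dim=256, kernel_size=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Character-level CNN: composes a fixed-size vector per word.
        self.char_conv = nn.Conv1d(char_dim, word_dim, kernel_size, padding=1)
        # Context-level CNN: convolves over the sequence of word vectors.
        self.context_conv = nn.Conv1d(word_dim, context_dim, kernel_size,
                                      padding=1)
        self.out = nn.Linear(context_dim, n_tags)

    def forward(self, chars):
        # chars: (batch, sent_len, word_len) character ids
        b, s, w = chars.shape
        x = self.char_emb(chars.view(b * s, w))      # (b*s, w, char_dim)
        x = self.char_conv(x.transpose(1, 2))        # (b*s, word_dim, w)
        words = x.max(dim=2).values.view(b, s, -1)   # max-pool over characters
        ctx = torch.relu(self.context_conv(words.transpose(1, 2)))
        return self.out(ctx.transpose(1, 2))         # (b, s, n_tags)
```

The per-token scores can be trained with a standard cross-entropy loss against gold tags; because word vectors are built entirely from characters, the model has no fixed word vocabulary, which is what makes it robust to out-of-vocabulary and unnormalized input.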

Cited by 17 publications (16 citation statements)
References 16 publications

“…Neural Multiclass classifier (MC) As the second baseline, we employ the standard multiclass classifier used by both Heigold et al (2017) and Yu et al (2017). The proposed model consists of an LSTM-based encoder, identical to the one described above in section 3.3, and a softmax classifier over the full tagset.…”
Section: Baseline Models
confidence: 99%
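
A minimal sketch of such a baseline, assuming PyTorch; the `MulticlassTagger` name and all dimensions are hypothetical, and the softmax is left to the loss function, as is idiomatic:

```python
import torch.nn as nn


class MulticlassTagger(nn.Module):
    """Hypothetical MC baseline: biLSTM encoder + classifier over full tagset."""

    def __init__(self, vocab_size, n_tags, emb_dim=100, hidden_dim=200):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # LSTM-based encoder over the word sequence.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # One output unit per tag in the full (possibly very large) tagset;
        # softmax/cross-entropy is applied at training time.
        self.classifier = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, words):                  # words: (batch, sent_len)
        hidden, _ = self.encoder(self.emb(words))
        return self.classifier(hidden)         # (batch, sent_len, n_tags)
```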
“…To generate the pre-trained word embeddings, we have used FastText (https://fasttext.cc/docs/ en/crawl-vectors.html (accessed on 12 June 2018)) embeddings corresponding to each language. To construct the character-based word composition vector, we fix the input size as 32 for each word as in Reference [63]. Six convolutional filters with kernel sizes of 1, 2, 3, 4, 5 and 7 were used.…”
Section: Experiments and Results
confidence: 99%
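
The quoted setup is concrete enough to sketch: each word is padded or truncated to 32 characters and fed through six parallel convolutions with kernel sizes 1, 2, 3, 4, 5 and 7, whose max-pooled outputs are concatenated into the composition vector. In this PyTorch rendering, the character-embedding size and the number of filters per kernel size are assumptions not given in the quoted text.

```python
import torch
import torch.nn as nn

MAX_WORD_LEN = 32                  # fixed input size per word, as quoted
KERNEL_SIZES = (1, 2, 3, 4, 5, 7)  # the six convolutional filter widths


class CharComposer(nn.Module):
    """Hypothetical character-based word composition module."""

    def __init__(self, n_chars, char_dim=30, filters_per_size=25):
        super().__init__()
        self.emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(char_dim, filters_per_size, k) for k in KERNEL_SIZES
        )

    def forward(self, chars):                # chars: (batch, MAX_WORD_LEN)
        x = self.emb(chars).transpose(1, 2)  # (batch, char_dim, 32)
        # Max-pool each convolution over the character axis, then concatenate.
        pooled = [conv(x).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)      # (batch, 6 * filters_per_size)
```

The resulting composition vector can then be concatenated with the pre-trained FastText embedding of the same word before it enters the sentence-level encoder.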
“…Many studies focus on the encoder portion of the model. Approaches include using convolutional neural networks (CNNs) [42], biLSTM's [43,44], and combinations of the two [40,45,46].…”
Section: Sequence Tagging
confidence: 99%
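
As a sketch of the "combinations of the two" mentioned in the statement, the following hypothetical PyTorch encoder composes word vectors with a character-level CNN and contextualizes them with a biLSTM; all names and sizes are illustrative assumptions.

```python
import torch.nn as nn


class CNNBiLSTMEncoder(nn.Module):
    """Hypothetical combined encoder: char-CNN word vectors, biLSTM context."""

    def __init__(self, n_chars, char_dim=30, word_dim=150, hidden_dim=200):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_conv = nn.Conv1d(char_dim, word_dim, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(word_dim, hidden_dim, batch_first=True,
                              bidirectional=True)

    def forward(self, chars):                  # (batch, sent_len, word_len)
        b, s, w = chars.shape
        x = self.char_emb(chars.view(b * s, w)).transpose(1, 2)
        words = self.char_conv(x).max(dim=2).values.view(b, s, -1)
        encoded, _ = self.bilstm(words)        # (batch, sent_len, 2*hidden_dim)
        return encoded
```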