Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1276

Morphological Analysis for Unsegmented Languages using Recurrent Neural Network Language Model

Abstract: We present a new morphological analysis model that considers semantic plausibility of word sequences by using a recurrent neural network language model (RNNLM). In unsegmented languages, since language models are learned from automatically segmented texts and inevitably contain errors, it is not apparent that conventional language models contribute to morphological analysis. To solve this problem, we do not use language models based on raw word sequences but use a semantically generalized language model, RNNLM…
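To make the abstract's idea concrete, the sketch below is a minimal PyTorch illustration, not the authors' implementation: the model size, vocabulary, and the 東京都 example are all assumptions. It scores two candidate segmentations of the same string by the summed log-probability an RNN language model assigns to their word sequences, which is one natural way to realize the semantic-plausibility signal the abstract describes.

```python
# Minimal sketch (illustrative assumptions throughout, not the paper's code):
# rank candidate segmentations by RNNLM log-probability of the word sequence.
import torch
import torch.nn as nn

class TinyRNNLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, ids):
        h, _ = self.rnn(self.embed(ids))
        return self.out(h)  # logits over the next word at each position

def sequence_log_prob(model, ids):
    """Sum of log P(w_t | w_<t) over a word-id sequence of shape (1, T)."""
    with torch.no_grad():
        logp = torch.log_softmax(model(ids[:, :-1]), dim=-1)
        target = ids[:, 1:]
        return logp.gather(-1, target.unsqueeze(-1)).sum().item()

# Usage: compare two segmentations of 東京都. The model here is untrained,
# so scores are arbitrary; in practice the RNNLM would be trained on
# (automatically segmented) text, as the abstract describes.
vocab = {"<s>": 0, "</s>": 1, "東京": 2, "都": 3, "東": 4, "京都": 5}
model = TinyRNNLM(len(vocab))
cand_a = torch.tensor([[0, 2, 3, 1]])  # 東京 / 都 (Tokyo + metropolis)
cand_b = torch.tensor([[0, 4, 5, 1]])  # 東 / 京都 (east + Kyoto)
print(sequence_log_prob(model, cand_a), sequence_log_prob(model, cand_b))
```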

Cited by 77 publications (50 citation statements)
References 10 publications (12 reference statements)

“…In contrast, we leverage both character embeddings and word embeddings for better accuracies. (Morita et al., 2015; Liu et al., 2016; Cai and Zhao, 2016), which are different from our work in the basic framework. For instance, Liu et al. (2016) follow Andrew (2006), using a semi-CRF for structured inference.…”
Section: Error Analysis (contrasting)
confidence: 62%
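The excerpt above contrasts character-level and word-level representations; the following sketch (an assumed toy architecture, not any cited paper's exact model) shows one common way to leverage both, by concatenating a word embedding with an average of the word's character embeddings.

```python
# Illustrative sketch (assumed design): a word representation that combines
# a word embedding with a summary of the word's character embeddings.
import torch
import torch.nn as nn

class CharWordEmbedding(nn.Module):
    def __init__(self, n_words, n_chars, word_dim=50, char_dim=20):
        super().__init__()
        self.word_embed = nn.Embedding(n_words, word_dim)
        self.char_embed = nn.Embedding(n_chars, char_dim)

    def forward(self, word_id, char_ids):
        w = self.word_embed(word_id)           # (word_dim,)
        c = self.char_embed(char_ids).mean(0)  # average over the characters
        return torch.cat([w, c], dim=-1)       # (word_dim + char_dim,)

emb = CharWordEmbedding(n_words=100, n_chars=500)
vec = emb(torch.tensor(7), torch.tensor([3, 42]))  # a two-character word
print(vec.shape)  # torch.Size([70])
```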
“…角 (corner)” (Zheng et al., 2013), which is infeasible by using sparse one-hot character features. In addition to character embeddings, distributed representations of character bigrams (Pei et al., 2014) and words (Morita et al., 2015; Zhang et al., 2016b) have also been shown to improve segmentation accuracies.…”
Section: Introduction (mentioning)
confidence: 99%
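For the character-bigram representations mentioned above, here is a small illustrative sketch (the vocabularies and function names are hypothetical) of how each character position can be paired with unigram and bigram feature ids before embedding lookup.

```python
# Hypothetical feature extraction: each position t gets the unigram at t
# and the bigram (t, t+1), both mapped to ids for embedding tables.
def char_bigram_ids(chars, uni_vocab, bi_vocab, unk=0):
    """Return (unigram_id, bigram_id) pairs for each character position."""
    feats = []
    for t, ch in enumerate(chars):
        bi = ch + (chars[t + 1] if t + 1 < len(chars) else "</s>")
        feats.append((uni_vocab.get(ch, unk), bi_vocab.get(bi, unk)))
    return feats

uni = {"東": 1, "京": 2, "都": 3}
bi = {"東京": 1, "京都": 2, "都</s>": 3}
print(char_bigram_ids("東京都", uni, bi))  # [(1, 1), (2, 2), (3, 3)]
```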
“…With respect to non-linear modeling power, various network structures have been exploited to represent contexts for segmentation disambiguation, including multi-layer perceptrons on five-character windows (Zheng et al., 2013; Pei et al., 2014; Chen et al., 2015a), as well as LSTMs on characters (Chen et al., 2015b; Xu and Sun, 2016) and words (Morita et al., 2015; Cai and Zhao, 2016; Zhang et al., 2016b). For structured learning and inference, CRF has been used for character sequence labelling models (Pei et al., 2014; Chen et al., 2015b) and structural beam search has been used for word-based segmentors (Cai and Zhao, 2016; Zhang et al., 2016b).…”
Section: Introduction (mentioning)
confidence: 99%
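As a deliberately simplified companion to the excerpt above, the sketch below runs a BiLSTM over characters and scores B/M/E/S segmentation tags per position. The CRF or beam-search layer used in the cited systems is omitted, and all layer sizes are assumptions.

```python
# Minimal character-tagging sketch (not any cited paper's code): a BiLSTM
# reads the character sequence and scores B/M/E/S tags at each position.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, n_chars, emb_dim=32, hidden=64, n_tags=4):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, char_ids):
        h, _ = self.lstm(self.embed(char_ids))
        return self.out(h)  # per-character scores over {B, M, E, S}

tagger = BiLSTMTagger(n_chars=500)
scores = tagger(torch.tensor([[3, 42, 7]]))  # one 3-character sentence
print(scores.argmax(-1))  # greedy tag indices; a CRF would decode jointly
```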
“…Unfortunately, such large-scale data is not available for many lesser-studied languages, including Ainu. For Japanese and Chinese, word segmentation is sometimes modelled jointly with part-of-speech tagging, as the output of the latter task can provide useful information to the segmenter [21, 28–30].…”
Section: Related Work (mentioning)
confidence: 99%
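The joint segmentation-and-tagging idea in this last excerpt is often realized with a product tag set, where one tagger predicts both decisions at once; a minimal illustration follows (the POS inventory here is an arbitrary assumption).

```python
# Hedged illustration: joint word segmentation + POS tagging via a product
# tag set such as "B-NOUN" (begin a word that is a noun).
SEG = ["B", "M", "E", "S"]
POS = ["NOUN", "VERB", "PART"]  # illustrative subset, not a real tag set
JOINT_TAGS = [f"{s}-{p}" for s in SEG for p in POS]
print(len(JOINT_TAGS), JOINT_TAGS[:4])
# 12 ['B-NOUN', 'B-VERB', 'B-PART', 'M-NOUN']
```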