Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1141

Long Short-Term Memory Neural Networks for Chinese Word Segmentation

Abstract: Currently, most state-of-the-art methods for Chinese word segmentation are based on supervised learning, whose features are mostly extracted from a local context. These methods cannot utilize long-distance information, which is also crucial for word segmentation. In this paper, we propose a novel neural network model for Chinese word segmentation, which adopts the long short-term memory (LSTM) neural network to keep previous important information in a memory cell and avoids the limit of window size of local context.
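The abstract describes a character-level LSTM tagger for segmentation. As a minimal sketch of that idea, the PyTorch snippet below embeds characters, runs them through an LSTM, and scores one B/M/E/S segmentation tag per character; the class name, vocabulary size, dimensions, and tag set are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn

# Minimal sketch: an LSTM reads character embeddings left to right and
# predicts one segmentation tag per character (B/M/E/S). All sizes are
# illustrative assumptions, not the paper's hyperparameters.
class LSTMSegmenter(nn.Module):
    def __init__(self, num_chars=5000, emb_dim=100, hidden_dim=150, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(num_chars, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, num_tags)  # B, M, E, S

    def forward(self, char_ids):          # char_ids: (batch, seq_len)
        h, _ = self.lstm(self.embed(char_ids))
        return self.score(h)              # (batch, seq_len, num_tags)

model = LSTMSegmenter()
tags = model(torch.randint(0, 5000, (1, 8)))  # dummy 8-character sentence
print(tags.shape)  # torch.Size([1, 8, 4])
```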

Cited by 255 publications (239 citation statements). References 29 publications.
“…We replace the discrete word and character features of Zhang and Clark (2007) with word and character embeddings, respectively, and change their linear model into a deep neural network. Following Zheng et al. (2013) and Chen et al. (2015b), we use convolutional neural networks to achieve local feature combination and LSTM to learn global sentence-level features, respectively. The resulting model is a word-based neural segmenter that can leverage rich embedding features.…”
Section: Introduction (mentioning)
Confidence: 99%
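The quoted design pairs a convolutional layer for local feature combination with an LSTM for global sentence-level features. A minimal sketch of that pipeline, with window size and dimensions as illustrative assumptions rather than the citing paper's settings:

```python
import torch
import torch.nn as nn

# Sketch: a convolution merges each character with its local window, and
# an LSTM carries sentence-level context over the convolved features.
emb_dim, hidden_dim, window = 100, 150, 3  # assumed sizes
conv = nn.Conv1d(emb_dim, emb_dim, kernel_size=window, padding=window // 2)
lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

chars = torch.randn(1, 10, emb_dim)                  # embedded characters
local = conv(chars.transpose(1, 2)).transpose(1, 2)  # local window features
global_feats, _ = lstm(torch.relu(local))            # sentence-level features
print(global_feats.shape)                            # torch.Size([1, 10, 150])
```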
“…However, we employed a different training objective. Chen et al. (2015) employed a max-margin objective; while they found that this objective yielded better results, we observed that maximum likelihood yielded better segmentation results in our experiments. Additionally, we sought to integrate their model with a log-bilinear CRF, which uses a maximum-likelihood training objective.…”
Section: LSTM for Word Segmentation (mentioning)
Confidence: 46%
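The contrast drawn in this quote is between a max-margin (structured hinge) objective and maximum likelihood. The toy example below computes both losses for one hypothetical sentence with three candidate segmentations; the scores and margin are made up for illustration, and a real CRF would compute the normalizer with dynamic programming rather than enumeration.

```python
import math

# Hypothetical unnormalized model scores for candidate segmentations
# of one sentence; the names and values are illustrative only.
candidate_scores = {"gold": 4.2, "best_wrong": 3.9, "other": 1.0}

# Max-margin (structured hinge): penalize unless the gold segmentation
# beats the best wrong candidate by at least a fixed margin.
margin = 1.0
hinge = max(0.0, margin + candidate_scores["best_wrong"] - candidate_scores["gold"])

# Maximum likelihood: negative log-probability of the gold segmentation
# under a softmax over all candidates (a CRF computes this normalizing
# sum with dynamic programming instead of enumerating candidates).
log_z = math.log(sum(math.exp(s) for s in candidate_scores.values()))
nll = log_z - candidate_scores["gold"]

print(f"hinge loss: {hinge:.3f}, negative log-likelihood: {nll:.3f}")
```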
“…We propose a model that integrates the best Chinese word segmentation system (Chen et al., 2015), which uses an LSTM neural model to learn representations, with the best NER model for Chinese social media (Peng and Dredze, 2015), which supports training neural representations with a log-bilinear CRF. We begin with a brief review of each system.…”
Section: Model (mentioning)
Confidence: 99%
“…Following Chen et al. (2015b), a standard bi-LSTM model (Graves, 2008) is used to assign a segmentation label to each character. As shown in Figure 1, our model consists of a representation layer and a scoring layer.…”
Section: Concat (mentioning)
Confidence: 99%
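The quoted setup splits the model into a representation layer (a bidirectional LSTM) and a scoring layer. A minimal sketch under assumed dimensions; note that the scoring layer sees the concatenated forward and backward states, so its input width is twice the hidden size.

```python
import torch
import torch.nn as nn

# Sketch of the representation/scoring split described above:
# a bi-LSTM builds per-character representations, a linear layer
# scores segmentation labels. Dimensions are assumptions.
emb_dim, hidden_dim, num_tags = 100, 150, 4
bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
scorer = nn.Linear(2 * hidden_dim, num_tags)  # forward + backward states

chars = torch.randn(1, 10, emb_dim)           # embedded characters
reps, _ = bilstm(chars)                       # (1, 10, 2 * hidden_dim)
print(scorer(reps).shape)                     # torch.Size([1, 10, 4])
```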