Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1531
Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling

Abstract: Previous approaches to unsupervised Chinese word segmentation (CWS) can be roughly classified into discriminative and generative models. The former use carefully designed goodness measures to score candidate segmentations, while the latter focus on finding the segmentation with the highest generative probability. However, while discriminative models can be extended into neural versions in a trivial way by using neural language models, extending generative models is non-trivial. In this …
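The generative view in the abstract can be made concrete with a small sketch. The following is a minimal, illustrative dynamic program (not the authors' code) for recovering the segmentation with the highest generative probability under a segmental language model; the seg_logprob scoring function and the max_seg_len cap are assumptions introduced here for illustration.

    # Minimal sketch of generative segmentation search (illustrative only).
    # seg_logprob(prefix, segment) is a hypothetical stand-in for a trained
    # neural model returning log p(segment | prefix).
    from typing import Callable, List, Tuple

    def best_segmentation(x: str,
                          seg_logprob: Callable[[str, str], float],
                          max_seg_len: int = 4) -> Tuple[List[str], float]:
        """Viterbi-style DP over all segmentations whose concatenation is x."""
        n = len(x)
        best = [float("-inf")] * (n + 1)  # best[i]: top log-prob of x[:i]
        back = [0] * (n + 1)              # back[i]: start index of last segment
        best[0] = 0.0
        for i in range(1, n + 1):
            for j in range(max(0, i - max_seg_len), i):
                score = best[j] + seg_logprob(x[:j], x[j:i])
                if score > best[i]:
                    best[i], back[i] = score, j
        segments, i = [], n
        while i > 0:                      # follow backpointers to recover segments
            segments.append(x[back[i]:i])
            i = back[i]
        return segments[::-1], best[n]

Under these assumptions, calling best_segmentation on a raw character string returns the highest-scoring segment sequence and its log-probability; the quadratic-in-length DP is what makes exact search tractable without enumerating all 2^(n-1) segmentations.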

Cited by 18 publications (42 citation statements)
References 13 publications
“…Another line of techniques has focused on models that are both strong language models and good at sequence segmentation. Many are in some way based on Connectionist Temporal Classification (Graves et al., 2006), and include Sleep-WAke Networks (Wang et al., 2017), Segmental RNNs (Kong et al., 2016), and Segmental Language Models (Sun and Deng, 2018; Kawakami et al., 2019; Wang et al., 2021; Downey et al., 2021). In this work, we conduct experiments using the Masked Segmental Language Model of Downey et al. (2021), due to its good performance and scalability, the latter of which is usually regarded as an obligatory feature of crosslingual models (Conneau et al., 2020a; Xue et al., 2021, inter alia).…”
Section: Related Work
confidence: 99%
“…The sentences originally came in a train/validation/test split, but because gold-segmented sentences are so rare, we concatenate these sets and then split them in half into final validation and test sets. MSLMs: An MSLM is a variant of a Segmental Language Model (SLM) (Sun and Deng, 2018; Kawakami et al., 2019; Wang et al., 2021), which takes as input a sequence of characters x and outputs a probability distribution for a sequence of segments y such that the concatenation of the segments of y is equivalent to x: π(y) = x. An MSLM is composed of a Segmental Transformer Encoder and an LSTM-based Segment Decoder (Downey et al., 2021).…”
Section: Data and Pre-processing
confidence: 99%
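The constraint π(y) = x in the statement above means every candidate segment sequence y must concatenate back to the input string. A hedged sketch of the corresponding training quantity, the log marginal likelihood summed over all such segmentations, computed by a forward DP (again with the hypothetical seg_logprob and an assumed maximum segment length):

    # Illustrative forward DP for a segmental language model's marginal
    # likelihood; seg_logprob and max_seg_len are assumptions, not the
    # published implementation.
    import math
    from typing import Callable

    def logsumexp(values):
        """Numerically stable log of a sum of exponentials."""
        m = max(values)
        return m + math.log(sum(math.exp(v - m) for v in values))

    def marginal_logprob(x: str,
                         seg_logprob: Callable[[str, str], float],
                         max_seg_len: int = 4) -> float:
        """log p(x) = log of the sum, over all y with concat(y) == x,
        of the product of per-segment probabilities."""
        n = len(x)
        alpha = [float("-inf")] * (n + 1)  # alpha[i]: log p of generating x[:i]
        alpha[0] = 0.0
        for i in range(1, n + 1):
            alpha[i] = logsumexp([alpha[j] + seg_logprob(x[:j], x[j:i])
                                  for j in range(max(0, i - max_seg_len), i)])
        return alpha[n]

Maximizing this marginal over a corpus trains the model without any gold segmentation; the best segmentation is then read off with a Viterbi pass like the earlier sketch.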
“…To solve this problem we propose a type of Segmental Language Model (Sun and Deng, 2018; Kawakami et al., 2019), based on the powerful neural Transformer architecture (Vaswani et al., 2017).…”
Section: Introduction
confidence: 99%
“…Near-perfect supervised methods have been developed for use in resource-rich languages such as Chinese, but many of the world's languages are morphologically complex and have no large dataset of "gold" segmentations into meaningful units. To solve this problem, we propose a new type of Segmental Language Model (Sun and Deng, 2018; Kawakami et al., 2019; Wang et al., 2021), for use in both unsupervised and lightly supervised segmentation tasks. We introduce a Masked Segmental Language Model (MSLM) built on a span-masking transformer architecture, harnessing the power of a bi-directional masked modeling context and attention.…”
confidence: 99%