Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1100

Multi-Granularity Chinese Word Embedding

Abstract: This paper considers the problem of learning Chinese word embeddings. In contrast to English, a Chinese word is usually composed of characters, and most of the characters themselves can be further divided into components such as radicals. While characters and radicals contain rich information and are capable of indicating semantic meanings of words, they have not been fully exploited by existing word embedding methods. In this work, we propose multi-granularity embedding (MGE) for Chinese words. The key idea i…
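To make the decomposition described in the abstract concrete: a Chinese word splits into characters, and each character maps to a radical. Below is a minimal sketch; the sample word, its translation, and the character-to-radical lookup are illustrative assumptions (real systems derive the lookup from a character-radical index).

```python
# Toy illustration of the three granularities the abstract describes:
# word -> characters -> radicals. The radical lookup below is a
# hand-written assumption; production systems build it from a
# character-radical index.
radical_of = {"智": "日", "能": "月"}

word = "智能"                                   # "intelligence"
characters = list(word)                         # ["智", "能"]
radicals = [radical_of[c] for c in characters]  # ["日", "月"]
print(word, characters, radicals)
```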

Cited by 90 publications (82 citation statements)
References 13 publications

“…Multi-Granularity Representation: Multi-granularity representation, which is proposed to make full use of subunit composition at different levels of granularity, has been explored in various NLP tasks, such as paraphrase identification (Yin and Schütze, 2015), Chinese word embedding learning (Yin et al., 2016), universal sentence encoding and machine translation (Nguyen and Joty, 2018; Li et al., 2019b). The major difference between our work and Nguyen and Joty (2018); Li et al. (2019b) lies in that we successfully introduce syntactic information into our multi-granularity representation.…”
Section: Related Work (mentioning)
confidence: 99%
“…Based on CBOW and CWE, Yin et al. (2016) proposed MGE, which predicts the target word from its radical embeddings together with the CWE-style modified word embeddings of its context, as shown in Fig. 5(b).…” (see the sketch below)
Section: Multi-Granularity Embedding (MGE) (mentioning)
confidence: 99%
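To make the excerpt above concrete, here is a minimal NumPy sketch of the CBOW-style prediction step it describes: context words are composed with their character embeddings as in CWE, and the target word's radical embeddings are averaged into the hidden vector. The vocabularies, radical assignments, and averaging weights are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

# Toy embedding tables; MGE trains these jointly (e.g., with negative
# sampling over a large corpus). Entries here are random placeholders.
dim = 50
rng = np.random.default_rng(0)
word_emb = {w: rng.normal(size=dim) for w in ["智能", "手机", "市场"]}
char_emb = {c: rng.normal(size=dim) for c in "智能手机市场"}
rad_emb = {r: rng.normal(size=dim) for r in ["日", "月", "手", "木"]}  # assumed radicals

def cwe_compose(word):
    """CWE-style composition: average the word vector with the mean of
    its character vectors."""
    chars = np.mean([char_emb[c] for c in word], axis=0)
    return 0.5 * (word_emb[word] + chars)

def mge_hidden(context_words, target_radicals):
    """MGE hidden vector: mean of the composed context vectors, averaged
    with the mean of the target word's radical vectors (the 0.5 weights
    are an assumption of this sketch)."""
    ctx = np.mean([cwe_compose(w) for w in context_words], axis=0)
    rad = np.mean([rad_emb[r] for r in target_radicals], axis=0)
    return 0.5 * (ctx + rad)

# Score the target "手机" given its context plus its (assumed) radicals;
# training would feed this dot product into softmax / negative sampling.
h = mge_hidden(["智能", "市场"], ["手", "木"])
score = h @ word_emb["手机"]
```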
“…The glyph features improved CWE on WordSim-240 and SimLex-999, but not WordSim-296. As for the MGE results, we were not able to reproduce the performance reported in (Yin et al., 2016). We list possible reasons below: we did not separate non-compositional words during training (character and radical embeddings are not used for these words), and we created the character-radical index from a different data source.…”
Section: Word Similarity (mentioning)
confidence: 99%
“…We use THULAC (Sun et al., 2016b) for Chinese word segmentation and POS tagging. We identify all entity names for CWE (Chen et al., 2015) and MGE (Yin et al., 2016), as they do not use character information for non-compositional words. Our model (JWE) does not use such a non-compositional word list.…” (see the THULAC usage sketch below)
Section: Experimental Settings: Training Corpus (mentioning)
confidence: 99%
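Since the excerpt relies on THULAC for preprocessing, here is a minimal usage sketch of its Python package; the sample sentence is illustrative, and the default models ship with the package.

```python
# pip install thulac -- Chinese word segmentation + POS tagging,
# as used in the experimental settings quoted above.
import thulac

thu = thulac.thulac()  # default mode: segmentation with POS tags
# cut(..., text=True) returns a space-separated "word_tag" string.
print(thu.cut("多粒度中文词向量", text=True))
```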