2018
DOI: 10.1109/taslp.2017.2772846

A Neural Approach to Source Dependence Based Context Model for Statistical Machine Translation

Cited by 58 publications (19 citation statements)
References 29 publications

“…Word embedding is a real-valued vector representation of words that encodes both semantic and syntactic meaning learned from a large unlabeled corpus. It is a powerful tool widely used in modern natural language processing (NLP) tasks, including semantic analysis [1], information retrieval [2], dependency parsing [3], [4], [5], question answering [6], [7] and machine translation [6], [8], [9]. Learning a high-quality representation is extremely important for these tasks, yet the question "what is a good word embedding model" remains an open problem.…”
Section: Introduction
confidence: 99%
“…Their work was further advanced in [5], [6], which encoded the entire source sentence. Chen et al. [38], [43] flatten local dependency structures, including parent, children, and sibling words, into a word tuple, which is represented as a compositional vector by a convolutional NN to model long-distance dependency constraints. In spite of their success, their approaches center on capturing relations between each word and its local dependency words.…”
Section: Discussion
confidence: 99%
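The convolutional composition described in the statement above can be illustrated concretely. Below is a minimal sketch, assuming toy embeddings and random filter weights, of how a word's local dependency structure (parent, children, siblings) might be flattened into a fixed-length word tuple and composed into a single vector with a 1-D convolution and max-pooling; all names, dimensions, and the pooling choice are hypothetical placeholders, not Chen et al.'s actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy embedding table: word -> d-dimensional vector.
EMB_DIM = 8
vocab = ["<pad>", "bank", "river", "the", "of", "flows", "water"]
embeddings = {w: rng.normal(size=EMB_DIM) for w in vocab}

def flatten_dependency_tuple(word, parent, children, siblings, max_len=6):
    """Flatten a word's local dependency structure (parent, children,
    siblings) into a fixed-length word tuple, padding as needed."""
    tuple_words = [word, parent] + children + siblings
    tuple_words = tuple_words[:max_len]
    tuple_words += ["<pad>"] * (max_len - len(tuple_words))
    return tuple_words

def compose_with_conv(tuple_words, window=3):
    """Compose the tuple into one vector with a 1-D convolution over the
    word embeddings followed by max-pooling (a stand-in for the paper's
    convolutional composition; the filter weights here are random)."""
    X = np.stack([embeddings[w] for w in tuple_words])   # (L, d)
    W = rng.normal(size=(window * EMB_DIM, EMB_DIM))     # conv filter bank
    convolved = []
    for i in range(len(tuple_words) - window + 1):
        patch = X[i:i + window].reshape(-1)              # (window*d,)
        convolved.append(np.tanh(patch @ W))             # (d,)
    return np.max(np.stack(convolved), axis=0)           # max-pool -> (d,)

# Example: the word "flows" with its parent, one child and one sibling.
tup = flatten_dependency_tuple("flows", parent="river",
                               children=["water"], siblings=["the"])
vec = compose_with_conv(tup)
print(tup, vec.shape)
```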
“…For example, the probability is determined with four target history words and eleven source-side local context words. • DNNJM: a DBiCS-based NN joint model [38] that flattens local dependency structures, including parent, children, and sibling words, into a word tuple, which is represented as a vector by a convolutional NN to model long-distance dependency constraints. For example, the probability is determined with a source dependency context which includes n-gram dependency-based bilingual context units.…”
Section: Baseline System
confidence: 99%
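As a rough illustration of how such a joint model conditions the next-target-word probability on both the target history and a source-side dependency context, here is a minimal feed-forward sketch; the network shape, the example words, and the softmax over a toy vocabulary are hypothetical placeholders rather than the DNNJM of [38].

```python
import numpy as np

rng = np.random.default_rng(1)
EMB_DIM, HIDDEN = 8, 16

def embed(words, table):
    """Look up (or lazily create) toy embeddings and concatenate them."""
    for w in words:
        table.setdefault(w, rng.normal(size=EMB_DIM))
    return np.concatenate([table[w] for w in words])

def joint_model_score(target_history, source_context, target_vocab, table):
    """Toy feed-forward joint model: concatenate embeddings of the target
    history and the source dependency context, then produce a distribution
    over candidate target words (weights are random placeholders)."""
    x = embed(target_history + source_context, table)
    W1 = rng.normal(size=(x.size, HIDDEN))
    W2 = rng.normal(size=(HIDDEN, len(target_vocab)))
    h = np.tanh(x @ W1)
    logits = h @ W2
    probs = np.exp(logits - logits.max())
    return dict(zip(target_vocab, probs / probs.sum()))

table = {}
# Four target history words plus a source dependency context
# (e.g. the aligned source word with its parent and sibling words).
history = ["the", "bank", "of", "the"]
src_ctx = ["rive", "le", "du", "fleuve"]
print(joint_model_score(history, src_ctx, ["river", "money", "water"], table))
```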
“…The supervised BWE (Mikolov et al., 2013), which exploits similarities between the source language and the target language through a linear transformation matrix, serves as the basis for many NLP tasks, such as machine translation (Bahdanau et al., 2015; Vaswani et al., 2017; Chen et al., 2018b), dependency parsing, and semantic role labeling (Li et al., 2019). However, the lack of a large word-pair dictionary poses a major practical problem for many language pairs.…”
Section: Related Work
confidence: 99%
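The linear-transformation idea behind the supervised BWE mentioned above can be sketched in a few lines: given a small seed dictionary of word pairs, fit a matrix W by least squares so that source embeddings mapped through W land near their target translations, then translate by nearest neighbour in the target space. The embeddings, dictionary, and dimensions below are toy placeholders, assuming the standard ||XW − Z|| least-squares formulation attributed to Mikolov et al. (2013).

```python
import numpy as np

rng = np.random.default_rng(2)
DIM = 8

# Toy monolingual embedding spaces and a small seed dictionary of word pairs.
src_emb = {w: rng.normal(size=DIM) for w in ["chien", "chat", "maison"]}
tgt_emb = {w: rng.normal(size=DIM) for w in ["dog", "cat", "house", "tree"]}
seed_dict = [("chien", "dog"), ("chat", "cat"), ("maison", "house")]

# Stack the dictionary pairs and solve min_W ||X W - Z||_F by least squares,
# i.e. the linear transformation that maps source vectors to target vectors.
X = np.stack([src_emb[s] for s, _ in seed_dict])
Z = np.stack([tgt_emb[t] for _, t in seed_dict])
W, *_ = np.linalg.lstsq(X, Z, rcond=None)

def translate(src_word):
    """Map a source embedding into the target space and return the
    nearest target word by cosine similarity."""
    v = src_emb[src_word] @ W
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(tgt_emb, key=lambda t: cos(v, tgt_emb[t]))

print(translate("chien"))
```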