2017
DOI: 10.1515/pralin-2017-0011

Using Word Embeddings to Enforce Document-Level Lexical Consistency in Machine Translation

Abstract: We integrate new mechanisms in a document-level machine translation decoder to improve the lexical consistency of document translations. First, we develop a document-level feature designed to score the lexical consistency of a translation. This feature, which applies to words that have been translated into different forms within the document, uses word embeddings to measure the adequacy of each word translation given its context. Second, we extend the decoder with a new stochastic mechanism that, at translatio…
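The abstract describes a document-level feature that uses word embeddings to judge how well each translated form of an inconsistently translated word fits its context. The sketch below illustrates one plausible reading of such a feature; the data layout, the cosine-against-mean-context scoring, and all names are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch (not the authors' implementation): a document-level feature that
# scores lexical consistency with word embeddings. Assumed design: for each source
# word translated into more than one target form within a document, every occurrence
# is scored by the cosine similarity between the chosen target form's embedding and
# the mean embedding of its target-side context; the feature is the sum of these scores.
from collections import defaultdict
import numpy as np

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def lexical_consistency_feature(occurrences, embeddings):
    """occurrences: list of (source_word, target_word, target_context_words) tuples
    covering the whole document; embeddings: dict mapping word -> np.ndarray."""
    forms = defaultdict(set)
    for src, tgt, _ in occurrences:
        forms[src].add(tgt)
    score = 0.0
    for src, tgt, context in occurrences:
        if len(forms[src]) < 2:  # only inconsistently translated words are scored
            continue
        ctx_vecs = [embeddings[w] for w in context if w in embeddings]
        if tgt not in embeddings or not ctx_vecs:
            continue
        score += cosine(embeddings[tgt], np.mean(ctx_vecs, axis=0))
    return score

# Toy usage with random embeddings (real embeddings would be pre-trained).
rng = np.random.default_rng(0)
vocab = ["bank", "river", "money", "shore", "deposit"]
emb = {w: rng.normal(size=50) for w in vocab}
occs = [("banque", "bank", ["money", "deposit"]),
        ("banque", "shore", ["river"])]
print(lexical_consistency_feature(occs, emb))
```

In this reading, a document whose repeated source words are rendered by context-appropriate target forms receives a higher feature value, which a decoder could then weigh against its other scores.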

Cited by 16 publications (8 citation statements); references 4 publications.
“…This word segmentation method is not efficient, but its proposal lays the foundation for Chinese automatic word segmentation technology [7]. Relevant scholars have theorized the Chinese word segmentation method and proposed the "minimum number of words" segmentation theory; that is, each sentence should be segmented with the least number of words [8]. This word segmentation method is an improvement on the "word dictionary" word segmentation method, which has promoted the development of Chinese word segmentation technology.…”
Section: Introduction
confidence: 99%
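The excerpt above recalls the "minimum number of words" criterion: segment each sentence into as few dictionary words as possible. A minimal sketch of that criterion via dynamic programming follows; the toy dictionary, the single-character fallback, and the tie-breaking are illustrative assumptions, not a reconstruction of the cited method.

```python
# Hedged sketch of the "minimum number of words" criterion: segment a sentence
# into the fewest dictionary words via dynamic programming over prefixes.
def min_word_segment(sentence, dictionary, max_word_len=4):
    n = len(sentence)
    best = [float("inf")] * (n + 1)  # best[i] = fewest words covering sentence[:i]
    back = [0] * (n + 1)             # back[i] = start index of the last word used
    best[0] = 0
    for i in range(1, n + 1):
        for j in range(max(0, i - max_word_len), i):
            piece = sentence[j:i]
            # single characters are always allowed as a fallback "word"
            if (piece in dictionary or len(piece) == 1) and best[j] + 1 < best[i]:
                best[i] = best[j] + 1
                back[i] = j
    words, i = [], n
    while i > 0:
        words.append(sentence[back[i]:i])
        i = back[i]
    return list(reversed(words))

# Toy usage with a tiny dictionary of Chinese words.
dic = {"研究", "生命", "起源", "研究生"}
print(min_word_segment("研究生命起源", dic))  # -> ['研究', '生命', '起源'] (3 words)
```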
“…Another line of document-level NMT work (Xiong et al., 2018; Voita et al., 2019b) proposed a two-pass document decoding model inspired by the deliberation network (Xia et al., 2017) in order to incorporate target-side document context. A parallel line of work (Garcia et al., 2017, 2019; Yu et al., 2019) introduced document-level approaches that do not require training a context-conditional NMT model, instead introducing a separate language model to enforce consistency in the outputs of a sentence-level NMT model. Garcia et al. (2019) used a simple n-gram based semantic space language model (Hardmeier et al., 2012) to re-rank the outputs of the sentence-level NMT model inside the beam-search algorithm to enforce document-level consistency.…”
Section: Related Work
confidence: 99%
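The excerpt above describes pairing a sentence-level NMT system with a separate document-level model that re-ranks its hypotheses to enforce consistency. Below is a generic greedy re-ranker under assumed scoring details (a linear interpolation weight and a toy consistency score); it is not the semantic space language model of Hardmeier et al. (2012) nor the beam-search integration of Garcia et al. (2019).

```python
# Hedged sketch of re-ranking for document-level consistency: combine each
# hypothesis' sentence-level NMT score with a document-level consistency score
# and keep the best-scoring hypothesis per sentence. The linear interpolation,
# the weight, and `doc_consistency_score` are illustrative assumptions.
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, float]  # (translation, sentence-level NMT log-score)

def rerank_document(nbest_lists: List[List[Hypothesis]],
                    doc_consistency_score: Callable[[str, List[str]], float],
                    weight: float = 0.3) -> List[str]:
    """Greedily pick one hypothesis per sentence, scoring each candidate against
    the translations already chosen for the preceding sentences."""
    chosen: List[str] = []
    for nbest in nbest_lists:
        best_hyp, best_score = None, float("-inf")
        for text, nmt_score in nbest:
            score = nmt_score + weight * doc_consistency_score(text, chosen)
            if score > best_score:
                best_hyp, best_score = text, score
        chosen.append(best_hyp)
    return chosen

# Toy consistency score: reward reusing words already produced in the document.
def toy_consistency(text: str, context: List[str]) -> float:
    seen = {w for sent in context for w in sent.split()}
    return sum(1.0 for w in text.split() if w in seen)

nbest = [[("the bank approved the loan", -1.0), ("the shore approved the loan", -0.9)],
         [("she went to the bank", -1.2), ("she went to the shore", -1.1)]]
print(rerank_document(nbest, toy_consistency))
```

A real system would plug an actual document-level model into `doc_consistency_score` and tune `weight` on held-out data.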
“…Xiong et al. [29] propose to learn the topic structure of the source document and then map that structure to the target translation. In addition to these approaches leveraging discourse-level linguistic features for document translation, Garcia et al. [4] incorporate new word embedding features into the decoder to improve the lexical consistency of translations. Document-level NMT: In the context of neural machine translation, previous studies first incorporate contextual information into NMT models built on RNN networks.…”
Section: Related Work
confidence: 99%
“…For English-German translation, we used the WMT19 bilingual document-level training data [5], which contains 39k documents with …”
Footnotes: [3] http://nlp.nju.edu.cn/cwmt-wmt; [4] https://www.sogou.com/labs/resource/list_news.php; [5] https://s3-eu-west-1.amazonaws.com/tilde-model/rapid2019.de-en.zip
Section: Experimental Setting
confidence: 99%