Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2016
DOI: 10.18653/v1/p16-2049
Syntactically Guided Neural Machine Translation

Abstract: We investigate the use of hierarchical phrase-based SMT lattices in end-to-end neural machine translation (NMT). Weight pushing transforms the Hiero scores for complete translation hypotheses, with the full translation grammar score and full ngram language model score, into posteriors compatible with NMT predictive probabilities. With a slightly modified NMT beam-search decoder we find gains over both Hiero and NMT decoding alone, with practical advantages in extending NMT to very large input and output vocabularies.
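The decoding scheme the abstract describes, beam search restricted to the arcs of an SMT lattice, with each arc scored by an interpolation of the (weight-pushed) Hiero score and the NMT prediction, can be sketched as follows. The toy lattice, the interpolation weight `lam`, and the stand-in `nmt_logprob` table are illustrative assumptions, not the paper's actual models or tuning.

```python
import math

# Toy translation lattice: node -> list of (word, next_node, hiero_logprob).
# Node "F" is final. Arc scores are assumed already weight-pushed, so they
# behave like local posteriors (the paper's weight-pushing step).
LATTICE = {
    "S": [("the", "A", math.log(0.7)), ("a", "A", math.log(0.3))],
    "A": [("house", "F", math.log(0.6)), ("home", "F", math.log(0.4))],
    "F": [],
}

def nmt_logprob(prefix, word):
    # Stand-in for a real NMT conditional log-probability P(word | prefix);
    # a hypothetical table keyed only on the word, for illustration.
    table = {"the": 0.5, "a": 0.5, "house": 0.3, "home": 0.7}
    return math.log(table[word])

def guided_beam_search(lattice, start="S", final="F", beam=2, lam=0.5):
    """Beam search constrained to lattice arcs; each arc is scored by an
    interpolation of the pushed Hiero score and the NMT score."""
    hyps = [([], start, 0.0)]  # (words so far, current node, combined logprob)
    finished = []
    while hyps:
        expanded = []
        for words, node, score in hyps:
            if node == final:
                finished.append((words, score))
                continue
            for word, nxt, hiero_lp in lattice[node]:
                combined = lam * hiero_lp + (1.0 - lam) * nmt_logprob(words, word)
                expanded.append((words + [word], nxt, score + combined))
        hyps = sorted(expanded, key=lambda h: h[2], reverse=True)[:beam]
    return max(finished, key=lambda h: h[1])
```

Note how the interpolation changes the result: Hiero alone would pick "the house" (0.7 x 0.6), but the NMT preference for "home" flips the second word, which is the kind of complementary behavior the reported gains rely on.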

Cited by 57 publications (47 citation statements)
References 23 publications (27 reference statements)
“…Although Bahdanau et al (2015) used a bidirectional recurrent neural network (RNN) (Schuster and Paliwal, 1997) to consider preceding and following words jointly, these sequential representations are insufficient to fully capture the semantics of a sentence, because they do not account for the syntactic structure of the sentence (Eriguchi et al, 2016; Tai et al, 2015). By incorporating additional features into a sequential model, Stahlberg et al (2016) suggest that richer linguistic information can improve translation performance.…”
Section: Introduction
confidence: 99%
“…Some of these solutions improve performance, but they all require time-intensive training of the NMT models to use an enriched input representation or to optimize the parameters of the model. Stahlberg et al (2016) proposed an approach that can be applied at decoding time: a hierarchical PBSMT system generates translation lattices, which are then re-scored by the NMT decoder.…”
Section: Related Work
confidence: 99%
“…Regarding the handling of OOV words, Jean et al (2015) presented an efficient training method to support a larger vocabulary, which helps alleviate the OOV problem significantly. Stahlberg et al (2016) used SMT to produce candidate translations in the form of a lattice and NMT to re-score them. Because SMT uses a larger vocabulary than NMT, some OOV words can be retained.…”
Section: Related Work
confidence: 99%
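The OOV retention described in the last excerpt, where words outside the NMT vocabulary survive because they come from the larger SMT vocabulary, can be sketched minimally. The tiny vocabulary, the `<unk>` back-off, and the log-probability table below are illustrative assumptions: the NMT model scores an out-of-vocabulary arc as `<unk>`, but the surface form from the SMT lattice is kept in the output.

```python
# Minimal sketch: a hypothetical tiny NMT vocabulary with an <unk> token.
NMT_VOCAB = {"the", "house", "<unk>"}

def nmt_score_token(word, logprobs):
    """Look up the NMT log-probability, backing off to <unk> for OOV words."""
    key = word if word in NMT_VOCAB else "<unk>"
    return logprobs[key]

def rescore_path(path, logprobs):
    """Score an SMT lattice path with the NMT model.

    The OOV surface form stays in `path`: only its *score* falls back
    to <unk>, so the word itself is retained in the translation."""
    total = sum(nmt_score_token(w, logprobs) for w in path)
    return path, total

# Hypothetical scores; "Schloss" is outside the NMT vocabulary.
logprobs = {"the": -0.5, "house": -1.0, "<unk>": -3.0}
words, score = rescore_path(["the", "Schloss", "house"], logprobs)
```

Here "Schloss" is penalized with the `<unk>` score (-3.0) during rescoring but still appears verbatim in the output hypothesis, which is how lattice rescoring sidesteps the usual NMT behavior of emitting a literal UNK token.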