Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/D19-1072

Latent Part-of-Speech Sequences for Neural Machine Translation

Abstract: Learning target-side syntactic structure has been shown to improve Neural Machine Translation (NMT). However, incorporating syntax through latent variables introduces additional complexity in inference, as the model must marginalize over the latent syntactic structures. To avoid this, models often resort to greedy search, which explores only a limited portion of the latent space.
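
To make the inference issue concrete (this is the generic latent-variable NMT setup, not necessarily the paper's exact formulation): with a latent POS sequence $z$ for a source sentence $x$ and target sentence $y$, training requires the marginal likelihood

$$p(y \mid x) = \sum_{z} p(z \mid x)\, p(y \mid x, z),$$

where the sum ranges over exponentially many tag sequences. Greedy search sidesteps the sum by committing to a single sequence $\hat{z} = \arg\max_{z} p(z \mid x)$ and scoring only $p(y \mid x, \hat{z})$, which is cheap but explores exactly one point of the latent space.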

Cited by 12 publications (15 citation statements). References 27 publications (38 reference statements).
“…It could also explain the results for the large-scale WMT data, where only recurrent systems were able to take advantage of the linguistic annotations. This hypothesis is also compatible with the results reported on WMT data by Yang et al. (2019), who successfully leveraged TL linguistic annotations in Transformer systems using an ad-hoc architecture.…”
Section: Error Analysis (supporting)
Confidence: 90%
“…The literature mainly contains incomplete evidence. For instance, Yang et al. (2019) conclude that TL part-of-speech annotations boost translation quality with an ad-hoc architecture, but Wagner (2017) claims that TL morpho-syntactic description tags degrade translation quality when they are interleaved: it is not clear whether the difference between the two results is caused by the type of linguistic annotations or by the approach followed to integrate them. There are also contradictory results, such as those reported by Tamchyna et al. (2017), who claim that TL annotations are only useful when they are combined with lemmatisation, and Nadejde et al. (2017), who report positive results without lemmatisation.…”
Section: Introduction (mentioning)
Confidence: 99%
“…Categorical information has achieved great success in neural machine translation, such as part-of-speech (POS) tags in autoregressive translation (Yang et al., 2019) and syntactic labels in non-autoregressive translation (Akoury et al., 2019). Inspired by the broad application of categorical information, we propose to model the implicit categorical information of target words in a non-autoregressive Transformer.…”
Section: Modeling Target Categorical Information by Vector Quantization (mentioning)
Confidence: 99%
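
As a rough illustration of the vector-quantization idea in the statement above (a minimal sketch, not the cited paper's model: the codebook size, dimensions, and the straight-through gradient trick are assumptions for this example), mapping decoder states to a small learned codebook of word categories looks like this:

import torch

def vector_quantize(h, codebook):
    """Map each decoder state to its nearest codebook entry.

    h:        (batch, seq_len, dim) decoder hidden states
    codebook: (num_codes, dim) learned category embeddings
    """
    # Euclidean distance from every state to every code.
    dists = torch.cdist(h, codebook.unsqueeze(0).expand(h.size(0), -1, -1))
    codes = dists.argmin(dim=-1)                 # (batch, seq_len) category ids
    quantized = codebook[codes]                  # (batch, seq_len, dim)
    # Straight-through estimator: the forward pass uses the quantized value,
    # the backward pass sends gradients to h as if quantization were the identity.
    quantized = h + (quantized - h).detach()
    return quantized, codes

codebook = torch.randn(64, 512, requires_grad=True)  # 64 hypothetical categories
states = torch.randn(2, 7, 512)                      # toy decoder states
q, codes = vector_quantize(states, codebook)
print(q.shape, codes.shape)  # torch.Size([2, 7, 512]) torch.Size([2, 7])
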
“…Niehues and Cho apply multi-task learning, in which the encoder of the NMT model is also trained on tasks such as POS tagging and named-entity recognition [15]. There are also works that directly model the syntax of the target sentence during decoding [22][23][24].…”
Section: Related Work (mentioning)
Confidence: 99%