2016
DOI: 10.1007/s10579-016-9369-0

Large aligned treebanks for syntax-based machine translation

Abstract: We present a collection of parallel treebanks that have been automatically aligned on both the terminal and the nonterminal constituent level for use in syntax-based machine translation. We describe how they were constructed and applied to a syntax- and example-based machine translation system called Parse and Corpus-Based Machine Translation (PaCo-MT). For the language pair Dutch to English, we present evaluation scores of both the nonterminal constituent alignments and the MT system itself, and in the latter …

Cited by 6 publications (7 citation statements)
References 22 publications
“…Parallel treebanks [21] are syntactically annotated versions of parallel corpora. While the latter are traditionally used in data-driven MT systems, such as phrase-based SMT or NMT [22], parallel…”
Section: Parallel Treebanks
mentioning
confidence: 99%
“…Parallel treebanks [13] are syntactically annotated versions of parallel corpora. While the latter are traditionally used in data-driven MT systems, such as phrase-based SMT or NMT [14], parallel treebanks can be used to improve syntax-based statistical MT ([15,16]) by taking advantage of linguistic information, allowing higher levels of abstraction than in phrase-based SMT.…”
Section: Building Resources For Syntax-based Translation
mentioning
confidence: 99%
“…Tree alignment leads to parallel treebanks. In other words, such treebanks [13] are syntactically annotated versions of parallel corpora.…”
Section: Sub-sentential Alignment
mentioning
confidence: 99%
“…The Europarl parallel treebank: We have made an update of the treebank described in Kotzé et al. (2016): we used the data from Europarl version 7 (Koehn, 2005) and extracted the Dutch and English sentence-aligned data from www.statmt.org. The Dutch side was parsed with the Alpino parser and the English side with the Stanford parser (Klein and Manning, 2003) with added dependencies (de Marneffe et al., 2006).…”
mentioning
confidence: 99%