2017
DOI: 10.1017/s1351324917000195

Leveraging bilingual terminology to improve machine translation in a CAT environment

Abstract: This work focuses on the extraction and integration of automatically aligned bilingual terminology into a Statistical Machine Translation (SMT) system in a Computer Aided Translation scenario. We evaluate the proposed framework that, taking as input a small set of parallel documents, gathers domain-specific bilingual terms and injects them into an SMT system to enhance translation quality. Therefore, we investigate several strategies to extract and align terminology across languages and to integrate it in an S…

Cited by 22 publications
(17 citation statements)
References 23 publications
“…Inter alia, CLs are illustrated by the rules of the Simplified Technical English (STE). The study "Leveraging bilingual terminology to improve machine translation in a CAT environment" (Arcan et al, 2017) deals with the extraction and integration of automatically aligned bilingual terminology into a Statistical Machine Translation (SMT) system in a Computer Aided Translation scenario. The information technology nomenclature is compared and tested in English, Italian and German and draws the conclusion of the significant improvement of the system results, especially compared to the widely-used XML markup approach.…”
Section: Literature Review and Discussion (mentioning)
confidence: 99%
“…Besides monolingual term extraction, we also followed a different approach when it comes to bilingual term extraction [72,73]. We first perform monolingual extraction of domain-specific terms, using available terminology extractors, and then, given a source term and a parallel sentence pair in which it appears, a set of possible translations are obtained.…”
Section: Discussion (mentioning)
confidence: 99%
“…We first perform monolingual extraction of domain-specific terms, using available terminology extractors, and then, given a source term and a parallel sentence pair in which it appears, a set of possible translations are obtained. There are different options: to use automatic translation, trained on the same corpus using GIZA++ [40,43], to apply a word aligner [72], or to use log-likelihood comparison and phrase-based statistical machine translation models as in TermFinder [73]. We rely on previous research [27,39,40] that proved successful for bilingual term extraction in other domains, where one language is Serbian.…”
Section: Discussion (mentioning)
confidence: 99%
“…points, and a significant improvement from 0.05 to 3.03 in the X.M.L. markup method [15]. Passban et al (2017) proposed two different methods to associate complex words with complete sentences in multiple words or even simpler languages in the S.M.T.…”
Section: Literature Review (mentioning)
confidence: 99%