2016
DOI: 10.1613/jair.4761

Integrating Rules and Dictionaries from Shallow-Transfer Machine Translation into Phrase-Based Statistical Machine Translation

Abstract: We describe a hybridisation strategy whose objective is to integrate linguistic resources from shallow-transfer rule-based machine translation (RBMT) into phrase-based statistical machine translation (PBSMT). It basically consists of enriching the phrase table of a PBSMT system with bilingual phrase pairs matching transfer rules and dictionary entries from a shallow-transfer RBMT system. This new strategy takes advantage of how the linguistic resources are used by the RBMT system to segment the source-language…
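
As a rough illustration of what "enriching the phrase table" could look like, the following minimal Python sketch merges phrase pairs derived from RBMT resources into a table extracted from a parallel corpus. It is not the authors' implementation: the function name `enrich_phrase_table`, the `count` and `from_rbmt` fields, and the Spanish-English toy entries are all assumptions made for this example, and the scoring of phrase pairs is left out.

```python
def enrich_phrase_table(phrase_table, rbmt_pairs):
    """Return a copy of phrase_table with RBMT-derived pairs added.

    phrase_table maps (source, target) -> {"count": int, "from_rbmt": bool}.
    Both the layout and the field names are hypothetical.
    """
    # Copy each entry so the corpus-extracted table is left unmodified.
    enriched = {pair: dict(info) for pair, info in phrase_table.items()}
    for pair in rbmt_pairs:
        if pair in enriched:
            # Pair already seen in the corpus: only mark its RBMT origin.
            enriched[pair]["from_rbmt"] = True
        else:
            # Pair never observed in the corpus: add it with a zero count.
            enriched[pair] = {"count": 0, "from_rbmt": True}
    return enriched


# Toy data (hypothetical): one corpus pair and two RBMT dictionary pairs.
corpus_table = {("casa", "house"): {"count": 12, "from_rbmt": False}}
rbmt_pairs = [("casa", "house"), ("gato", "cat")]
enriched = enrich_phrase_table(corpus_table, rbmt_pairs)
```

Marking the origin of each pair with a boolean is one simple way to let a downstream model treat RBMT-derived entries differently from corpus-extracted ones; whether the described system does this is not stated in the abstract above.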

Citations: cited by 6 publications (3 citation statements)
References: 30 publications
“…Linguistic data from RBMT systems have already been used to enrich SMT systems (Tyers, 2009; Schwenk et al, 2009; Eisele et al, 2008; Sánchez-Cartagena et al, 2011a). We have already proved that using hand-written rules and dictionaries from RBMT yields better results than using only dictionaries (Sánchez-Cartagena et al, 2011a).…”
Section: Related Work (mentioning)
confidence: 93%
“…Step 1 in our work aims at extracting the phrases in the sentence. The commonly used methods for this task are usually based on graphs [29, 30] or probabilistics [31, 32]. Inspiring as they are, the results of these methods are still not satisfying enough for application; accuracy and recall are problems.…”
Section: Related Work (mentioning)
confidence: 99%
“…Thus, ruLearn reduces the difficulty of creating Apertium-based RBMT systems for new language pairs. ruLearn has been successfully used in the development of Apertium-based RBMT systems for Chinese→Spanish (Costa-Jussà and Centelles, 2015) and Serbian↔Croatian (Klubička et al, 2016). The rules obtained with ruLearn can also be integrated into a phrase-based SMT system by means of the hybridisation strategy we developed (Sánchez-Cartagena et al, 2016) and released as an open-source toolkit (Sánchez-Cartagena et al, 2012). When shallow-transfer rules extracted from the same training corpus are integrated into a phrase-based SMT system, the translation knowledge contained in the parallel corpus is generalised to sequences of words that have not been observed in the corpus, thus helping to mitigate data sparseness.…”
Section: Introduction (mentioning)
confidence: 99%