Proceedings of the 18th Conference on Computational Linguistics - 2000
DOI: 10.3115/992730.992736

Multi-level similar segment matching algorithm for translation memories and Example-based Machine Translation

Abstract: We propose a dynamic programming algorithm for calculating the similarity between two segments of words of the same language. The similarity is considered as a vector whose coordinates refer to the levels of analysis of the segments. This algorithm is extremely efficient for retrieving the best example in Translation Memory systems. The calculus being constructive, it also gives the correspondences between the words of the two segments. This allows the extension of Translation Memory systems towards Example-bas…
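
The idea of a similarity vector with one coordinate per level of analysis can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three levels (surface form, lemma, part of speech), the token triples, the function names, and the lexicographic comparison of vectors. The per-level score uses Python's `difflib.SequenceMatcher` as a stand-in and is not the matching algorithm proposed in the paper.

```python
from difflib import SequenceMatcher

# Hypothetical levels of analysis; each token is a (surface, lemma, pos) triple.
LEVELS = ("surface", "lemma", "pos")


def similarity_vector(seg_a, seg_b):
    """Return one similarity score per level, most specific level first."""
    vector = []
    for level_index in range(len(LEVELS)):
        a = [token[level_index] for token in seg_a]
        b = [token[level_index] for token in seg_b]
        # Stand-in scorer: 2 * matched items / total items, in [0, 1].
        vector.append(SequenceMatcher(None, a, b).ratio())
    return tuple(vector)


def best_example(query, memory):
    """Pick the stored segment whose vector is lexicographically largest.

    Comparing tuples lexicographically (surface first, then lemma, then
    part of speech) is an assumption of this sketch.
    """
    return max(memory, key=lambda segment: similarity_vector(query, segment))


query = [("cats", "cat", "N"), ("sleep", "sleep", "V")]
stored_1 = [("cat", "cat", "N"), ("sleeps", "sleep", "V")]
stored_2 = [("dogs", "dog", "N"), ("sleep", "sleep", "V")]

print(similarity_vector(query, stored_1))
print(best_example(query, [stored_1, stored_2]))
```

In this toy memory, `stored_1` shares no surface forms with the query but agrees on every lemma and tag, while `stored_2` shares one surface form. A retrieval policy that ranks surface matches first therefore prefers `stored_2`; a different ordering of the levels would change that choice.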

citations
Cited by 10 publications
(6 citation statements)
references
References 9 publications
“…The semantic tagset used by USAS is a language-independent multi-tier structure with 21 major discourse fields, subdivided into 232 sub-categories (such as I1.1- = Money: lack; A5.1- = Evaluation: bad), which can be used to detect the semantic context. Identification of semantically similar situations can be also improved by the use of segment-matching algorithms as employed in Example-Based MT (EBMT) and translation memories (Planas and Furuse, 2000; Carl and Way, 2003).…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…Further, in word-based measures, the exploitation of the information provided by the order of the words can be fundamental: word-order-sensitive approaches are demonstrated generally to outperform bag-of-words methods (Baldwin and Tanaka 2000). For instance, several papers (Collins and Cunningham 1996; Cranias et al 1994; Planas and Furuse 2000) report advanced hybrid word- and structure-based matching techniques on the TM. In particular, Collins and Cunningham (1996) and Cranias et al (1994) perform exact matching whereas Planas and Furuse (2000) adopt the edit-distance metric limiting themselves to deletions and equalities.…”
Section: Similarity Measures and Scoring Functions | Citation type: mentioning
confidence: 99%
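
An edit distance restricted to deletions and equalities, as the statement above describes it, can be sketched with a small dynamic programming table. This is an illustrative reading of that one sentence, not the published algorithm: the function name `match_segments`, the unit cost per deletion, and the single word level are all assumptions. Under these assumptions a word may only be kept (when it equals its counterpart) or deleted from either segment, and the traceback yields the word correspondences.

```python
def match_segments(a, b):
    """Align two word lists using only equalities and deletions.

    Returns (cost, pairs): cost is the number of words deleted from either
    segment, and pairs lists (index_in_a, index_in_b) for words kept as equal.
    """
    n, m = len(a), len(b)
    # cost[i][j]: fewest deletions needed to reconcile a[i:] with b[j:].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n or j == m:
                # One segment is exhausted: delete the rest of the other.
                cost[i][j] = (n - i) + (m - j)
            else:
                best = 1 + min(cost[i + 1][j], cost[i][j + 1])
                if a[i] == b[j]:
                    best = min(best, cost[i + 1][j + 1])
                cost[i][j] = best
    # Walk the table from the start to recover the correspondences.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j] and cost[i][j] == cost[i + 1][j + 1]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif cost[i][j] == 1 + cost[i + 1][j]:
            i += 1  # delete a[i]
        else:
            j += 1  # delete b[j]
    return cost[0][0], pairs


source = "the cat sat on the mat".split()
stored = "the cat sat on a mat".split()
print(match_segments(source, stored))
```

With substitutions disallowed, replacing "the" by "a" costs two deletions (one from each segment) rather than one edit, which is why the cost here equals the two lengths minus twice the number of kept words.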
“…For instance the approach in Brown (1996) and Leplus et al (2004) stores examples as strings of words together with some alignment and information on equivalence classes (numbers, weekdays, etc.). Other approaches (Cranias et al 1994; Collins and Cunningham 1996; Planas and Furuse 2000) are more language- and knowledge-dependent and perform some text operations (e.g. POS tagging and stemming) on the sentences in order to store more structural information about them.…”
Section: Logical Representation Of Examples | Citation type: mentioning
confidence: 99%
“…This involves an extension of the English semantic tagger, development of the Russian tagger with the target lexical coverage of 90% of source texts, designing the procedure for retrieval of semantically similar situations and completing the user interface. Identification of semantically similar situations can be improved by the use of segment-matching algorithms as employed in Example-Based MT and translation memories (Planas and Furuse, 2000; Carl and Way, 2003).…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…It is widely acknowledged that human translators can benefit from a wide range of applications in computational linguistics, including Machine Translation (Carl and Way, 2003), Translation Memory (Planas and Furuse, 2000), etc. There have been recent research on tools detecting translation equivalents for technical vocabulary in a restricted domain, e.g.…”
Section: Introduction | Citation type: mentioning
confidence: 99%