Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Lang 2003
DOI: 10.3115/1073483.1073499

Cognates can improve statistical translation models

Abstract: We report results of experiments aimed at improving the translation quality by incorporating the cognate information into translation models. The results confirm that the cognate identification approach can improve the quality of word alignment in bitexts without the need for extra resources.

Cited by 74 publications (70 citation statements)
References: 5 publications
“…Cognate word matching has been shown to facilitate the extraction of translation lexicons from comparable corpora (Melamed 1995; Koehn and Knight 2002; Kondrak, Marcu, and Knight 2003; Fišer and Ljubešić 2011; Fišer and Sagot 2015). In this area, a large number of similarity measures have been developed (Tiedemann 1999; Kondrak and Dorr 2004; Kondrak and Sherif 2006), and cognate generation models based on such similarity measures, stochastic transducers or HMMs have been introduced, e.g., Mann and Yarowsky (2001) for closely related languages or Scherrer (2007) for dialects.…”
Section: Automatic Modernisation Of Historical Language (mentioning)
confidence: 99%
“…Use of non-parallel data: Cognates can be extracted from monolingual data and used as a parallel lexicon (Hana et al, 2006; Mann and Yarowsky, 2001; Kondrak et al, 2003). However, our task is whole-text transformation, not just cognate extraction.…”
Section: Related Work (mentioning)
confidence: 99%
“…where LCS(s1, s2) is the longest common subsequence of strings s1 and s2 and |s| is the length of string s. Following Kondrak et al (2003), we retained only pairs of entities with LCSR > 0.58, a value they found to be useful for cognate extraction in many language pairs. We also retained pairs without spelling differences, i.e.…”
Section: Cognate Pair Extraction From Wikipedia Titles (mentioning)
confidence: 99%
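
The formula itself is cut off in the citation statement above, so the following is only a minimal Python sketch of how an LCSR filter of this kind could look. It assumes the commonly used definition of the longest common subsequence ratio, namely the LCS length divided by the length of the longer string; that denominator is an assumption, not something the excerpt confirms. The function names (`lcs_length`, `lcsr`, `is_cognate_candidate`) are hypothetical, and only the 0.58 threshold is taken from the quoted text.

```python
def lcs_length(s1: str, s2: str) -> int:
    """Length of the longest common subsequence of s1 and s2 (dynamic programming)."""
    prev = [0] * (len(s2) + 1)  # LCS lengths for the previous row of the table
    for ch1 in s1:
        curr = [0]
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(s1: str, s2: str) -> float:
    """LCS ratio. ASSUMPTION: normalised by the longer string's length."""
    if not s1 and not s2:
        return 0.0  # assumption: treat two empty strings as non-matching
    return lcs_length(s1, s2) / max(len(s1), len(s2))


def is_cognate_candidate(s1: str, s2: str, threshold: float = 0.58) -> bool:
    """Keep a pair only if LCSR is strictly above the threshold (0.58 in the quote)."""
    return lcsr(s1, s2) > threshold


if __name__ == "__main__":
    for pair in [("colour", "color"), ("night", "nacht"), ("abc", "xyz")]:
        print(pair, round(lcsr(*pair), 3), is_cognate_candidate(*pair))
```

The strict comparison mirrors the "LCSR > 0.58" wording; whether the cited work lowercases or otherwise normalises strings before comparing them is not stated in the excerpt, so no normalisation is applied here.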