Proceedings of the 16th Conference on Computational Linguistics - 1996
DOI: 10.3115/993268.993270

Extraction of lexical translations from non-aligned corpora

Abstract: A method for extracting lexical translations from non-aligned corpora is proposed to cope with the unavailability of large aligned corpus. The assumption that "translations of two co-occurring words in a source language also co-occur in the target language" is adopted and represented in the stochastic matrix formulation. The translation matrix provides the co-occurring information translated from the source into the target. This translated co-occurring information should resemble that of the original in the ta…
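
The matrix idea in the abstract can be illustrated with a small sketch. This is only one plausible reading, not the paper's actual formulation, which the truncated abstract does not spell out. It assumes a source co-occurrence matrix `A`, a target co-occurrence matrix `B`, and a row-stochastic translation matrix `T` whose row i is a probability distribution over target words for source word i. The product `T^T A T`, the absolute-difference distance, and all numbers are hypothetical choices made for illustration.

```python
# Toy illustration only: the matrices, the T^T A T form, and the distance
# measure are assumptions, not taken from the paper.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*x)]

def translate_cooccurrence(a, t):
    """Carry source co-occurrence A into the target vocabulary as T^T A T."""
    return matmul(matmul(transpose(t), a), t)

def distance(x, y):
    """Sum of absolute element-wise differences between two matrices."""
    return sum(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

# Hypothetical co-occurrence matrices over two source and two target words.
A = [[0.0, 0.5],
     [0.5, 0.0]]
B = [[0.0, 0.5],
     [0.5, 0.0]]

# Candidate translation matrices (each row sums to 1).
T_one_to_one = [[1.0, 0.0],
                [0.0, 1.0]]   # each source word maps to a distinct target word
T_collapsed = [[1.0, 0.0],
               [1.0, 0.0]]    # both source words map to the same target word

d_one_to_one = distance(translate_cooccurrence(A, T_one_to_one), B)
d_collapsed = distance(translate_cooccurrence(A, T_collapsed), B)
print(d_one_to_one, d_collapsed)
```

Under these assumptions, the candidate whose translated co-occurrence lies closer to `B` is the better translation matrix; here the one-to-one mapping scores a smaller distance than the collapsed one. A real system would search over `T` rather than compare two hand-picked candidates.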

Cited by 57 publications (38 citation statements)
References: 5 publications
“…The general idea of exploiting time correlations to acquire word translations from comparable corpora has been explored in several previous studies, e.g., (Fung, 1995; Rapp, 1995; Tanaka and Iwasaki, 1996). Recently, a Pearson correlation method was proposed to mine word pairs from comparable corpora (Tao and Zhai, 2005); this idea is similar to the method used in (Kay and Roscheisen, 1993) for sentence alignment.…”
Section: Previous Work (mentioning)
confidence: 96%
“…Comparable corpora have been studied extensively in the literature, e.g., (Fung, 1995; Rapp, 1995; Tanaka and Iwasaki, 1996; Franz et al., 1998; Ballesteros and Croft, 1998; Masuichi et al., 2000; Sadat et al., 2004), but transliteration in the context of comparable corpora has not been well addressed. The general idea of exploiting time correlations to acquire word translations from comparable corpora has been explored in several previous studies, e.g., (Fung, 1995; Rapp, 1995; Tanaka and Iwasaki, 1996).…”
Section: Previous Work (mentioning)
confidence: 99%
“…However, computational limitations hampered further extension of this method. In 1996, Tanaka and Iwasaki [9] demonstrated how to extract lexical translation candidates from non-aligned corpora using a similar idea. In 1999, this method was developed and improved by Rapp [10].…”
Section: Acquiring Translation From Non-parallel Corpora (mentioning)
confidence: 99%
“…The general idea of exploiting frequency correlations to acquire word translations from comparable corpora has already been explored in several previous studies (e.g., [6,10,15]). However, none of them has adopted a direct comparison of frequency distributions of candidate words as we do; rather they tend to compute the associations between the words in the same language and then compare association patterns in two different languages.…”
Section: Introduction (mentioning)
confidence: 99%
“…While comparable corpora have been studied extensively in the existing literature (e.g., [6,10,15,5,2,8,13]), almost all existing work assumes some kind of bilingual dictionary or translation examples to start with. We study how to map words and documents from comparable bilingual corpora without requiring any additional linguistic resources such as a bilingual dictionary.…”
Section: Introduction (mentioning)
confidence: 99%