2003
DOI: 10.1002/asi.10261

Automatic construction of English/Chinese parallel corpora

Abstract: As the demand for global information increases significantly, multilingual corpora have become a valuable linguistic resource for applications to cross-lingual information retrieval and natural language processing. In order to cross the boundaries that exist between different languages, dictionaries are the most typical tools. However, the general-purpose dictionary is less sensitive in both genre and domain. It is also impractical to manually construct tailored bilingual dictionaries or sophisticated multiling…

Cited by 54 publications (29 citation statements)
References: 32 publications
“…Much of this work has been done on either finding parallel sentences from small corpora [28] or web pages [23,26,28,32]. Most of the work on finding web page has utilized structural information - HTML markup such anchors, links, filenames - to find [23,26] parallel resources.…”
Section: Related Work (mentioning)
confidence: 99%
“…Alignment was specifically rejected as being too expensive. [32] limited the alignment to titles and a translation dictionary to find parallel texts. Much of the machine translation work seems to be on the extraction of bilingual dictionaries [11] rather than finding document translations in large corpora.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the case of Hong Kong, as a British colony for more than a century, there has been a bilingual culture. Consequently, the official languages are Chinese and English and many important documents are written in Chinese and English, using covert translation (Yang & Li, 2003). For example, most of the documents released by the government have both Chinese and English versions.…”
Section: Parallelism Chinese and English Parallel Documents (mentioning)
confidence: 99%
“…In earlier related studies, it was determined that grammatical and lexical differences do have a significant effect on text processing (Yang & Li, 2003). For example, a word in one language can be translated into one or more words in another language.…”
Section: Introduction (mentioning)
confidence: 99%
“…Resnick (1999) addressed the issue of automatic language identification for acquiring parallel corpus of Web documents. Yang and Li (2003) tried to construct English-Chinese parallel corpora from the Web sites with monolingual subtree structure. Then, the parallel corpora could be used in automatic generation of English-Chinese thesaurus (Yang & Luk, 2003) as a promising tool for cross-lingual information retrieval.…”
Section: Related Work (mentioning)
confidence: 99%