2006
DOI: 10.1002/asi.20326

Conceptual analysis of parallel corpus collected from the Web

Abstract: As illustrated by the World Wide Web, the volume of information in languages other than English has grown significantly in recent years. This highlights the importance of multilingual corpora. Much effort has been devoted to the compilation of multilingual corpora for the purpose of cross-lingual information retrieval and machine translation. Existing parallel corpora mostly involve European languages, such as English-French and English-Spanish. There is still a lack of parallel corpora between European langua…

Cited by 7 publications (3 citation statements)
References 20 publications (18 reference statements)
“…However, as ICD-9 codes are primarily used for billing purposes, they are not always informative for syndromic surveillance [18,19]. As such, freetext CCs remain one of the most important data sources for syndromic surveillance [20].…”
Section: Non-English Chief Complaint Classification Methods (mentioning)
confidence: 99%
“…The corpus-based approach [20, 25-27] analyzes large document collections (parallel or comparable corpora) to construct a statistical translation model. It has the potential to translate emerging terminologies.…”
Section: Major Cross-lingual Information Retrieval Approaches (mentioning)
confidence: 99%
“…However, the use of the web as a bilingual or multilingual corpus for language studies, for natural language processing and for cross-language information retrieval (Chen & Nie, 2000; Resnik, 2003; Sigurbjörnsson, Kamps & Rijke, 2005; Chesñevar, Sabaté & Maguitman, 2006; Li & Yang, 2006; Wang et al., 2006) is only in its infant stage. The relatively recent exploration of the web as a bilingual or multi-lingual corpus was made possible by the rapid growth in the number of web pages, and the availability of vast quantities of web-based translation texts involving many language pairs.…”
Section: Introduction (mentioning)
confidence: 99%