2003
DOI: 10.1007/978-3-540-45175-4_13

Cross-Lingual Text Categorization

Abstract: This article deals with the problem of Cross-Lingual Text Categorization (CLTC), which arises when documents in different languages must be classified according to the same classification tree. We describe practical and cost-effective solutions for automatic Cross-Lingual Text Categorization, both in case a sufficient number of training examples is available for each new language and in the case that for some language no training examples are available. Experimental results of the bi-lingual classifi…

Cited by 103 publications (62 citation statements)
References 15 publications
“…Unfortunately, the above-mentioned problems regarding availability, accessibility, and performance still hold in this case. The effect of different translation strategies on CLTC has been investigated by Bel, Koster, and Villegas (2003), Rigutini, Maggini, and Liu (2005), and Wei, Lin, and Yang (2011).…”
Section: Exploiting External Multilingual Resources (mentioning)
confidence: 99%
“…But on the Wikipedia articles, JointLDA model achieves lower perplexity scores which indicate better predictability than a bag-of-word translation model. This leaves a possibility for JointLDA to be preferred over bag-of-word translation for applications like CLIR and Cross-lingual Text Categorization [2].…”
Section: Perplexity Of the Aligned Test Set (mentioning)
confidence: 99%
“…This situation raises the need for novel ways of organizing a multilingual corpus based on common topics/events, which could potentially be useful for many cross-lingual applications like Cross-Lingual Information Retrieval (CLIR) [1] and cross-lingual text classification [2]. Though there have been many attempts to mine the topical structure from a document corpus [3][4][5] most of these approaches operate in a monolingual scenario.…”
Section: Introduction (mentioning)
confidence: 99%
“…Bel et al. [9] were amongst the early pioneers examining crosslingual text categorization. They used the Rocchio algorithm, a popular learning method based on relevance feedback, and the Winnow algorithm, a method for learning a linear classifier from labeled examples, to categorize documents in multiple languages.…”
Section: A] Machine Translation Techniques (mentioning)
confidence: 99%
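
The last citation statement names two learners, Rocchio and Winnow. The sketch below shows textbook versions of both on a toy monolingual corpus, to illustrate what "a linear classifier learned from labeled examples" can look like. It is not the implementation used by Bel et al.: the Rocchio variant here keeps only class centroids (no negative-example term), the Winnow variant is the basic promote/demote form over binary term features, and all function names, parameter values, and training documents are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_rocchio(examples):
    """Build one centroid (mean term-frequency vector) per class.

    examples: list of (list_of_terms, class_label).
    Simplified Rocchio: positive centroid only, no negative-example term.
    """
    sums = defaultdict(Counter)
    counts = Counter()
    for terms, label in examples:
        sums[label].update(terms)  # accumulate term frequencies per class
        counts[label] += 1
    return {
        label: {t: c / counts[label] for t, c in tf.items()}
        for label, tf in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rocchio(centroids, terms):
    """Assign the class whose centroid is most similar to the document."""
    doc = Counter(terms)
    return max(centroids, key=lambda label: cosine(doc, centroids[label]))


def train_winnow(examples, vocabulary, alpha=2.0, epochs=10):
    """Basic Winnow for one binary category over binary term features.

    examples: list of (set_of_terms, is_member) with is_member True/False.
    alpha and the threshold (half the vocabulary size) are assumed values.
    """
    weights = {term: 1.0 for term in vocabulary}
    threshold = len(vocabulary) / 2.0
    for _ in range(epochs):
        mistakes = 0
        for terms, label in examples:
            active = [t for t in terms if t in weights]
            predicted = sum(weights[t] for t in active) >= threshold
            if predicted == label:
                continue
            mistakes += 1
            # Promote active terms on a missed positive, demote on a false alarm.
            factor = alpha if label else 1.0 / alpha
            for t in active:
                weights[t] *= factor
        if mistakes == 0:
            break
    return weights, threshold


def predict_winnow(weights, threshold, terms):
    """Terms outside the training vocabulary contribute nothing."""
    return sum(weights.get(t, 0.0) for t in terms) >= threshold


# Toy training data (hypothetical).
rocchio_examples = [
    (["goal", "match", "team"], "sport"),
    (["team", "goal"], "sport"),
    (["stock", "market", "bank"], "finance"),
    (["market", "bank"], "finance"),
]
centroids = train_rocchio(rocchio_examples)

vocabulary = {"goal", "match", "team", "stock", "market", "bank"}
winnow_examples = [
    ({"goal", "match", "team"}, True),
    ({"stock", "market", "bank"}, False),
    ({"goal", "team"}, True),
    ({"market", "bank"}, False),
]
weights, threshold = train_winnow(winnow_examples, vocabulary)
```

Calling `predict_rocchio(centroids, ["goal", "team"])` or `predict_winnow(weights, threshold, {"goal", "team"})` then classifies a new document. In a cross-lingual setting, the terms of a document in another language would first have to be mapped into the training language (or the training data translated), which this sketch does not attempt.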