2003
DOI: 10.1007/978-3-540-39907-0_28

Learning-Free Text Categorization

Abstract: In this paper, we report on the fusion of simple retrieval strategies with thesaural resources in order to perform large-scale text categorization tasks. Unlike most related systems, which rely on training data in order to infer text-to-concept relationships, our approach can be applied with any controlled vocabulary and does not use any training data. The first classification module uses a traditional vector-space retrieval engine, which has been fine-tuned for the task, while the second classifier …
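The idea of categorizing without training data can be illustrated with a minimal sketch: each entry of a controlled vocabulary is treated as a tiny document (its preferred term plus synonyms), and the input text is used as a query against that collection with TF-IDF weights and cosine similarity. This is an illustration only, not the authors' engine. The function name `rank_categories`, the toy `VOCAB`, the tokenizer, and the smoothed IDF formula are all assumptions made for this example, and the second classifier mentioned in the abstract is not modelled.

```python
import math
import re
from collections import Counter

# Toy controlled vocabulary (hypothetical): category -> preferred term and synonyms.
VOCAB = {
    "Neoplasms": ["neoplasms", "tumors", "cancer"],
    "Hypertension": ["hypertension", "high blood pressure"],
    "Diabetes Mellitus": ["diabetes mellitus", "diabetes"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_categories(document, vocabulary, top_k=3):
    """Rank vocabulary categories against a document; no training data is used."""
    # Each category becomes a bag of words built from its term and synonyms.
    cat_counts = {
        cat: Counter(tok for s in strings for tok in tokenize(s))
        for cat, strings in vocabulary.items()
    }
    n = len(cat_counts)
    # Document frequency of each token across the category "documents".
    df = Counter(tok for counts in cat_counts.values() for tok in counts)
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {tok: math.log((n + 1) / (d + 1)) + 1.0 for tok, d in df.items()}

    def weigh(counts):
        # Tokens that never occur in the vocabulary are dropped.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    doc_vec = weigh(Counter(tokenize(document)))
    scored = [(cosine(doc_vec, weigh(c)), cat) for cat, c in cat_counts.items()]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(cat, round(score, 3)) for score, cat in scored[:top_k]]


if __name__ == "__main__":
    text = "Patients with high blood pressure were treated for six months."
    for category, score in rank_categories(text, VOCAB):
        print(category, score)
```

Because the ranking relies only on the vocabulary itself, swapping in a different thesaurus needs no retraining, which is the property the abstract emphasizes.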

Cited by 12 publications (8 citation statements)
References 17 publications
“…Although the automatic assignment of MeSH indexing terms to a body of biomedical text has been extensively studied in the literature (see for example [1], [5], [6], [7], [8], [9], [10]), several major aspects of the task are often misunderstood or understated. Most issues pertain to the following topics:

multi-label assignment

scalability

compliance with indexing policies

…”
Section: Introduction (mentioning)
confidence: 99%
“…Note that in [6], MeSH main headings are referred to as “MeSH categories”. Most efforts addressing MeSH indexing attempt to tackle indexing by solely using main headings which involves about 24,000 categories [1], [7]. However, in practice, MeSH indexing terms also include main heading/subheading pairs.…”
Section: Introduction (mentioning)
confidence: 99%
“…For a short introduction on automatic text categorization in MEDLINE, the reader is referred to the NLM's indexing initiative [9]; for a detailed presentation of our vector space engine and a comparison with state-of-the-art systems, including NLM's tools, see [3] (in this joint evaluation between four retrieval systems, our engine showed competitive performances) [10]. For a complete overview and evaluation of our categorization system applied on Medical Subject Headings and on the Gene Ontology, see [11].…”
Section: Methods (mentioning)
confidence: 99%
“…It should be noted that in 2007 the CISMeF team carried out an evaluation of automatic indexing [26] [29] and NomIndex [14], using a French corpus, “misc”, and the French resources of the “ENFR” corpus [30]. For a rank equal to 10 (this rank is the number of keywords ranked according to a computed score), the recall values … This comparison between automatic indexing methods remains approximate, because the test corpora are different and were manually indexed by different indexers.…”
Section: Error Categories, Relative Frequencies of the Categories D (unclassified)