2005
DOI: 10.1007/11564126_21

Word Sense Disambiguation for Exploiting Hierarchical Thesauri in Text Classification

Abstract: The introduction of hierarchical thesauri (HT) that contain significant semantic information has led researchers to investigate their potential for improving performance of the text classification task, extending the traditional "bag of words" representation by incorporating syntactic and semantic relationships among words. In this paper we address this problem by proposing a Word Sense Disambiguation (WSD) approach based on the intuition that word proximity in the document implies proximity also in t…

Cited by 43 publications (63 citation statements)
References 11 publications
“…Semantic-aware kernels have been proposed by Mavroeidis et al [4] who propose a generalized vector space model with WordNet senses and their hypernyms to improve text classification performance. Bloehdorn et al.…”
Section: Semantics In Text Mining and Information Retrieval (mentioning)
confidence: 99%
“…This latter definition of SR for a pair of terms is the definition of the Omiotis measure that we are using in our case. 4 4 Omiotis-based Semantic Kernel…”
Section: Semantic Relatedness and The Omiotis Measure (mentioning)
confidence: 99%
“…Siolas and d'Alché Buc (2000) pioneered the idea of semantic kernels for text categorization, capitalizing on WordNet (Miller, 1995) to propose continuous word kernels based on the inverse of the path lengths in the tree rather than the common delta word kernel used so far, i.e. exact matching between unigrams. Bloehdorn et al (2006) extended it later to other tree-based similarity measures from WordNet while Mavroeidis et al (2005) exploited its hierarchical structure to define a Generalized Vector Space Model kernel.…”
Section: Introduction (mentioning)
confidence: 99%
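
The contrast drawn in the statement above, between a delta word kernel (exact unigram matching) and a continuous word kernel based on inverse path length in a tree, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy taxonomy `PARENT` is hand-built and is not WordNet data, the function names are hypothetical, and the `1 / (1 + path length)` form is one possible reading of "inverse of the path lengths", not necessarily the formula used in any of the cited papers.

```python
# Hypothetical toy taxonomy (child -> parent); a stand-in for a thesaurus tree.
PARENT = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "animal": None,
}


def ancestors(term):
    """Return [term, parent, grandparent, ...] up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def path_length(a, b):
    """Number of edges on the tree path between a and b."""
    depth_in_b = {node: i for i, node in enumerate(ancestors(b))}
    for i, node in enumerate(ancestors(a)):
        if node in depth_in_b:
            # First shared ancestor: edges up from a plus edges up from b.
            return i + depth_in_b[node]
    raise ValueError("terms share no ancestor")


def delta_kernel(a, b):
    """Exact matching between unigrams: 1 if identical, else 0."""
    return 1.0 if a == b else 0.0


def inverse_path_kernel(a, b):
    """Assumed form: 1 / (1 + path length), so identical words score 1."""
    return 1.0 / (1.0 + path_length(a, b))


def document_similarity(doc_a, doc_b, word_kernel):
    """Sum the word kernel over all cross-document word pairs."""
    return sum(word_kernel(u, v) for u in doc_a for v in doc_b)
```

With the delta kernel, `document_similarity(["dog"], ["wolf"], delta_kernel)` is zero because the two documents share no unigram, whereas the path-based kernel gives the pair a nonzero score because both words sit under the same parent node in the toy tree.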
“…Whereas we regard document categorization by SVM [30,50,49,4,38,6] a particular implementation of machine learning, an increasingly successful solution to the classical problem of automatic classification, we also envisage information representation by vectors, a standard point of departure for TC by SVM, a limitation of the above attempt, and combine the former with semantic content representation in Hilbert space instead of Euclidean space. In this new approach, instead of term and document vectors, term and document functions are used to represent the semantic content of digital objects, with the advantage that functions, having more parameters than vectors, can host more semantic content in a comprehensive description than vector space based methods.…”
Section: Introduction (mentioning)
confidence: 99%