2007
DOI: 10.1541/ieejeiss.127.2077

Text Classification by Combining Different Distance Functions with Weights

Abstract: Text classification is an important subject in data mining. Several methods have been developed for it, such as nearest neighbor analysis and latent semantic analysis. The k-nearest neighbor (kNN) classification is a well-known, simple, and effective method for classifying data in many domains. When using kNN, the distance function is important for measuring the distance and similarity between data. To improve the performance of the kNN classifier, a…
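The abstract is truncated, so the paper's exact weighting scheme is not reproduced here. The sketch below only illustrates the general idea named in the title: combining different distance functions with a weight inside a kNN classifier. The choice of Euclidean and cosine distances, the weight alpha, and all function names are assumptions for illustration, not taken from the paper; X would be a matrix of term-frequency or TF-IDF document vectors and y their class labels.

```python
import numpy as np
from collections import Counter

def euclidean(a, b):
    # Plain Euclidean distance between two document vectors.
    return np.linalg.norm(a - b)

def cosine_distance(a, b):
    # 1 - cosine similarity; treats a zero vector as maximally distant.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 1.0
    return 1.0 - (a @ b) / denom

def combined_distance(a, b, alpha=0.5):
    # Convex combination of two distance functions; alpha is an
    # illustrative weight, not the paper's tuning scheme.
    return alpha * euclidean(a, b) + (1.0 - alpha) * cosine_distance(a, b)

def knn_predict(query, X, y, k=3, alpha=0.5):
    # Rank training documents by the combined distance and take a
    # majority vote among the k nearest neighbours.
    dists = [combined_distance(query, x, alpha) for x in X]
    nearest = np.argsort(dists)[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]
```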

Cited by 6 publications (2 citation statements). References: 8 publications.
“…Here we use two dimensions to produce a figure that is easy to display and interpret, acknowledging that there will necessarily be some error in the relative location of the text terms. KH Coder supports a number of distance measures and dimensional reduction techniques; here we use the Cosine distance measure [26] in combination with the Sammon method for dimensional reduction [27] in one case, and the Euclidean distance measure [28] in combination with the classical method for dimensional reduction [29] in the other case. Based on specifying the minimum frequency of occurrence of a term for inclusion in the MDS analysis and visualisation, terms appear as circles/bubbles in the plot, and it is possible to configure the plot to indicate the relative frequency of terms by the relative size of their bubble.…”
Section: KH Coder Visualisation Methods (mentioning)
confidence: 99%
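As a rough illustration of the pairing described in the statement above (a distance matrix over terms followed by dimensional reduction), the sketch below computes cosine and Euclidean distance matrices from a toy term-frequency table and embeds each in two dimensions with classical (Torgerson) MDS. This is not KH Coder's implementation, the Sammon mapping mentioned in the quote is not reproduced, and the term-frequency values are invented for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def classical_mds(dist_matrix, n_components=2):
    # Classical (Torgerson) MDS: double-center the squared distances,
    # then embed with the leading eigenvectors.
    n = dist_matrix.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (dist_matrix ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clamp tiny negative eigenvalues that arise from numerical noise.
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0))

# Illustrative term-frequency vectors (rows = terms, columns = documents).
term_freq = np.array([[5, 0, 2, 1],
                      [4, 1, 3, 0],
                      [0, 6, 1, 2],
                      [1, 5, 0, 3]], dtype=float)

coords_cosine = classical_mds(squareform(pdist(term_freq, metric="cosine")))
coords_euclidean = classical_mds(squareform(pdist(term_freq, metric="euclidean")))
```

Each row of the resulting coordinate arrays gives the 2-D position of one term, which could then be plotted as the bubbles described in the quoted passage.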
“…Hence, several methods have been introduced to improve the traditional NN. Some previous works have developed suitable distance metrics [2,3]. The adaptive metric method by Domeniconi et al. [4] and the flexible metric method by Friedman [5] are two methods proposed in this field.…”
Section: Introduction (mentioning)
confidence: 99%