2015
DOI: 10.1007/s10994-015-5495-y

On some transformations of high dimension, low sample size data for nearest neighbor classification

Abstract: For data with more variables than the sample size, phenomena like concentration of pairwise distances, violation of cluster assumptions and presence of hubness often have adverse effects on the performance of the classic nearest neighbor classifier. To cope with such problems, some dimension reduction techniques like those based on random linear projections and principal component directions have been proposed in the literature. In this article, we construct nonlinear transformations of the data based on inter…
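To make the "concentration of pairwise distances" phenomenon concrete, here is a minimal illustrative sketch (not taken from the paper; the synthetic data and names are ours): with the sample size held fixed, the relative contrast between the largest and smallest pairwise distances shrinks toward zero as the dimension grows.

```python
# Illustrative sketch only: distance concentration in high dimensions.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n = 50  # small sample size, as in the HDLSS setting

for d in (2, 15, 100, 1000, 10000):
    X = rng.standard_normal((n, d))
    dists = pdist(X)  # all pairwise Euclidean distances
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d = {d:>5}: relative contrast = {contrast:.3f}")
# The contrast decreases toward 0 as d grows: all points become nearly
# equidistant, which is what hurts the classic nearest neighbor rule.
```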

Cited by 17 publications (14 citation statements)
References 27 publications (33 reference statements)
“…Beyer et al (1999) show that distance concentration can occur with as few as 15 dimensions. See Dutta & Ghosh (2016) and Hall et al (2005).…”
Section: Comparison of Methods (mentioning)
confidence: 99%
“…In Table , we can see that increasing the number of trials increases the discriminant power in all classifiers. Dutta & Ghosh (2016) show the adverse effects of high dimensions on the performance of the classic NN classifier. This is also seen in Table .…”
Section: Simulation (mentioning)
confidence: 99%
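As a toy illustration of that adverse effect (a synthetic setup of our own, not the simulation from the citing paper), 1-NN accuracy on two Gaussian classes whose means differ only in one coordinate drifts toward chance as noise dimensions are added:

```python
# Illustrative sketch only: 1-NN accuracy degrading with dimension.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_per_class = 25  # HDLSS-style: few observations per class

for d in (2, 10, 100, 1000):
    shift = np.zeros(d)
    shift[0] = 2.0  # the signal lives in a single coordinate
    X = np.vstack([rng.standard_normal((n_per_class, d)),
                   rng.standard_normal((n_per_class, d)) + shift])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=5).mean()
    print(f"d = {d:>4}: 1-NN cross-validated accuracy = {acc:.2f}")
```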
“…Let Z be a new observation to be classified. Dutta and Ghosh [22] illustrate a transformation based on the IPDs to classify Z. For $N_x, N_y \ge 2$, these transformed data points are given as follows: $\mathbf{X}_i = \left( \|\mathbf{X}_i - \mathbf{X}_1\|_d, \ldots, \|\mathbf{X}_i - \mathbf{X}_{N_x}\|_d, \|\mathbf{X}_i - \mathbf{Y}_1\|_d, \ldots, \|\mathbf{X}_i - \mathbf{Y}_{N_y}\|_d \right)$, $\mathbf{Y}_j = \left( \|\mathbf{Y}_j - \mathbf{X}_1\|_d, \ldots, \|\mathbf{Y}_j - \mathbf{X}_{N_x}\|_d, \|\mathbf{Y}_j - \mathbf{Y}_1\|_d, \ldots, \|\mathbf{Y}_j - \mathbf{Y}_{N_y}\|_d \right)$.…”
Section: Applications (unclassified)
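The quoted transformation is easy to prototype. Below is a minimal sketch (function and variable names are ours, not the authors'): every observation, training or new, is replaced by the vector of its Euclidean distances to all N_x + N_y training points, and nearest neighbor classification is then carried out in that transformed space.

```python
# Sketch of the inter-point-distance (IPD) transformation quoted above.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def ipd_transform(points, train):
    """Map each row of `points` to its distances to every row of `train`."""
    # Broadcasting gives an array of shape (len(points), len(train)).
    return np.linalg.norm(points[:, None, :] - train[None, :, :], axis=2)

rng = np.random.default_rng(2)
d, n_x, n_y = 500, 20, 20                 # HDLSS: d much larger than n_x + n_y
X = rng.standard_normal((n_x, d))         # class 1 training sample
Y = rng.standard_normal((n_y, d)) + 0.3   # class 2 training sample (shifted)
train = np.vstack([X, Y])
labels = np.array([0] * n_x + [1] * n_y)
Z = rng.standard_normal((5, d)) + 0.3     # new observations to classify

clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(ipd_transform(train, train), labels)  # NN in the (n_x + n_y)-dim space
print(clf.predict(ipd_transform(Z, train)))   # classify the transformed Z
```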
“…When a Big Data problem is presented as a domain with a large number of characteristics, dimensionality reduction approaches (Dutta and Ghosh) may be needed to accelerate distance computations in nearest neighbors classification. The locality-sensitive hashing (LSH) algorithm (Andoni and Indyk) is a well-known example that reduces the dimensionality of the data using hash functions, with the particularity of looking for collisions between instances that are similar.…”
Section: The k-NN Algorithm in Big Data: Current and Future Trends (mentioning)
confidence: 99%
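For intuition, here is a rough sketch of the random-hyperplane flavour of LSH (SimHash-style; a simplification rather than Andoni and Indyk's exact construction): nearby vectors are likely to share a bit signature, so candidate neighbors can be retrieved by a bucket lookup instead of a scan over all points.

```python
# Illustrative sketch only: random-hyperplane locality-sensitive hashing.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(3)
d, n_bits = 100, 16
planes = rng.standard_normal((n_bits, d))  # random hyperplane normals

def signature(x):
    """n_bits-bit hash: the sign pattern of x against the random planes."""
    return tuple((planes @ x > 0).astype(int))

# Index the dataset into hash buckets.
data = rng.standard_normal((1000, d))
buckets = defaultdict(list)
for i, x in enumerate(data):
    buckets[signature(x)].append(i)

# Query: only points that collide in the same bucket are examined further.
query = data[0] + 0.01 * rng.standard_normal(d)  # a near-duplicate of point 0
candidates = buckets[signature(query)]
print("candidates sharing the query's bucket:", candidates[:10])
```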