Proceedings of the 20th ACM International Conference on Information and Knowledge Management 2011
DOI: 10.1145/2063576.2063919

A probabilistic approach to nearest-neighbor classification

Abstract: Most machine-learning tasks, including classification, involve dealing with high-dimensional data. It was recently shown that the phenomenon of hubness, inherent to high-dimensional data, can be exploited to improve methods based on nearest neighbors (NNs). Hubness refers to the emergence of points (hubs) that appear among the k NNs of many other points in the data, and constitute influential points for kNN classification. In this paper, we present a new probabilistic approach to kNN classification, naive hubness-Bayesian k-nearest neighbor (NHBNN) …

Cited by 39 publications (45 citation statements) | References 12 publications | Citing publications span 2014–2021

Citation statements (ordered by relevance):
“…classifier [23] and the self-training semi-supervised learning technique in Section 2.1 and Section 2.2. The presentation of NHBNN and self-training is based on [21] and [11], respectively.…”
Section: Methods
confidence: 99%
“…In particular, our approach is an extension of the Naive Hubness Bayesian k-Nearest Neighbor, or NHBNN for short [23], which is one of the most promising hubness-aware classifiers. As we will show, straightforward incorporation of semi-supervised classification techniques with NHBNN leads to suboptimal results; therefore, we develop a hubness-aware inductive semi-supervised classification scheme.…”
Section: Introduction
confidence: 99%
“…Informally, this estimate can be interpreted as follows: we consider m additional pseudo-instances from each class and we assume that x i appears as one of the k-nearest neighbors of the pseudo-instances from class C. We use m = 1 in our experiments. Even though k-occurrences are highly correlated, as shown in [19] and [21], NHBNN offers improvement over the basic kNN. This is in accordance with other results from the literature that state that Naive Bayes can deliver good results even in cases with high independence assumption violation [15].…”
Section: NHBNN: Naive Hubness-Bayesian k-Nearest Neighbor
confidence: 99%
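The smoothed estimate described in the statement above admits a compact reading: with m pseudo-instances per class, each assumed to contain x_i among its k nearest neighbors, both the class-conditional occurrence count and the class size grow by m. Below is a minimal sketch of that reading in Python; the function name and the use of scikit-learn's NearestNeighbors are illustrative assumptions, not code from the paper.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smoothed_occurrence_probs(X, y, k=5, m=1):
    """Laplace-style smoothing of class-conditional k-occurrence counts,
    per the pseudo-instance description quoted above (hypothetical helper;
    names are not from the paper).

    Returns P[i, c] ~= P(x_i appears among the k NNs of a class-c point).
    """
    classes = np.unique(y)          # sorted class labels
    n = X.shape[0]
    # Ask for k + 1 neighbors, since each query point returns itself first.
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    idx = idx[:, 1:]                # drop the self-neighbor column

    # N[i, c]: how many class-c points have x_i among their k NNs.
    N = np.zeros((n, classes.size))
    for j in range(n):
        c = np.searchsorted(classes, y[j])
        N[idx[j], c] += 1

    n_c = np.array([(y == c).sum() for c in classes])
    # m pseudo-instances per class, each assumed to have x_i as a neighbor:
    # occurrence counts and class sizes both grow by m.
    return (N + m) / (n_c + m)
```

Roughly, NHBNN then scores a query's class c in naive-Bayes fashion: the class prior P(c) times the product of P[i, c] over the query's k nearest neighbors x_i.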
“…In the context of k-nearest neighbor classification, bad hubs were shown to be responsible for a surprisingly large portion of the total classification error. Therefore, hubness-aware classifiers were developed, such as the Naive Hubness Bayesian k-Nearest Neighbor, or NHBNN for short [21].…”
Section: Introduction
confidence: 99%
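To make the notion of "bad hubs" in the statement above concrete, the sketch below counts, for each training point, its total k-occurrences N_k and its label-mismatched ("bad") k-occurrences BN_k. This is a hypothetical illustration of the standard definitions, not code from the cited papers.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def hubness_scores(X, y, k=5):
    """N_k[i]: number of points that have x_i among their k NNs.
    BN_k[i]: those occurrences where the labels disagree ('bad' occurrences).
    """
    n = X.shape[0]
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    idx = idx[:, 1:]                      # drop the self-neighbor column

    N_k = np.zeros(n, dtype=int)
    BN_k = np.zeros(n, dtype=int)
    for j in range(n):                    # each point in j's kNN list gains an occurrence
        for i in idx[j]:
            N_k[i] += 1
            if y[i] != y[j]:
                BN_k[i] += 1
    return N_k, BN_k
```

Points with unusually large N_k are hubs; a high BN_k / N_k ratio marks a bad hub, i.e. an influential neighbor whose occurrences mostly carry a conflicting label, which is precisely what hubness-aware classifiers such as NHBNN are designed to account for.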