2019
DOI: 10.1007/s42452-019-1356-9
Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets

Abstract: Distance-based algorithms are widely used for data classification problems. The k-nearest neighbour classification (k-NN) is one of the most popular distance-based algorithms. This classification is based on measuring the distances between the test sample and the training samples to determine the final classification output. The traditional k-NN classifier works naturally with numerical data. The main objective of this paper is to investigate the performance of k-NN on heterogeneous datasets, where data can be…
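As a minimal sketch of the idea the abstract describes, the snippet below runs k-NN over heterogeneous records using a mixed distance: Euclidean on numeric attributes and 0/1 overlap on categorical ones (in the spirit of HEOM). The distance rule, data, and function names are illustrative assumptions, not the exact measures evaluated in the paper.

```python
import numpy as np
from collections import Counter

def mixed_distance(a, b, numeric_idx, categorical_idx):
    # HEOM-style rule (an illustrative assumption, not necessarily the
    # paper's measure): squared difference on numeric features, 0/1
    # mismatch on categorical ones. Numeric features would normally be
    # normalised first so no single attribute dominates.
    d = 0.0
    for i in numeric_idx:
        d += (float(a[i]) - float(b[i])) ** 2
    for i in categorical_idx:
        d += 0.0 if a[i] == b[i] else 1.0
    return np.sqrt(d)

def knn_predict(X_train, y_train, x_test, k, numeric_idx, categorical_idx):
    # Majority vote among the k training records closest to x_test.
    dists = [mixed_distance(x, x_test, numeric_idx, categorical_idx)
             for x in X_train]
    nearest = np.argsort(dists)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy heterogeneous records: (age, income, colour) -- hypothetical data.
X_train = [(25, 40.0, "red"), (47, 80.0, "blue"), (52, 75.0, "blue")]
y_train = ["A", "B", "B"]
print(knn_predict(X_train, y_train, (30, 45.0, "red"), k=3,
                  numeric_idx=[0, 1], categorical_idx=[2]))
```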

Cited by 190 publications (108 citation statements) | References 31 publications
“…KNN is a supervised learning model that is considered to be one of the simplest ML models available [13]. KNN is referred to as a lazy learner because no training is done with KNN; instead, the training data are used at prediction time to classify the data [13]. KNN operates under the assumption that similar data points group together, and it finds the closest data points using the K value, which can be set to any number [14].…”
Section: Background and Related Work
confidence: 99%
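The "lazy learner" behaviour this statement describes is visible in typical scikit-learn usage: fit() essentially stores the training samples, and the distance computations happen at prediction time. The synthetic dataset below is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# "Training" a lazy learner just stores the samples; the distance
# computations against every stored sample happen inside predict()/score().
clf = KNeighborsClassifier(n_neighbors=3)  # K can be set to any positive integer
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```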
“…The second concept involves using only a small subset of the features when splitting the nodes in the trees [23]. This is done to prevent overfitting, where the model uses the training data to inflate its predictions [13]. When making predictions with RF, the average of the individual trees' predictions is used to determine the overall class of the data; this process is called bootstrap aggregating [13].…”
Section: Background and Related Work
confidence: 99%
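The two mechanisms in this statement, bootstrap sampling of the training data and random feature subsets at each split, correspond directly to standard parameters of scikit-learn's RandomForestClassifier. The values below are illustrative defaults, not settings taken from the cited work.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

clf = RandomForestClassifier(
    n_estimators=100,     # number of trees whose predictions are aggregated
    max_features="sqrt",  # random feature subset considered at each split
    bootstrap=True,       # each tree is trained on a bootstrap sample
    random_state=0,
)
clf.fit(X, y)
print(clf.predict(X[:3]))
```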
“…KNN is one of the most popular distance-based classification techniques. KNN measures the distances between the training samples and the test samples to determine the final classification output [36].…”
Section: Chemometrics
confidence: 99%
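The distance measurement this statement refers to can be expressed as a single pairwise-distance computation between the test and training samples; the arrays below are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])  # illustrative
X_test = np.array([[0.5, 0.5], [3.0, 3.0]])

# dists[i, j] = Euclidean distance from test sample i to training sample j;
# k-NN keeps the k smallest entries in each row.
dists = cdist(X_test, X_train)
print(dists.argsort(axis=1)[:, :2])  # indices of the 2 nearest neighbours
```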
“…The K-Nearest Neighbour classifier is designed to function by measuring distance. It attempts to classify numerical data records by finding the K nearest neighbours, measuring the distance between the training samples and the test samples according to the Euclidean metric [33]. In this approach, the output is a class label, and K is usually a small positive integer giving the number of neighbours considered.…”
Section: K-Nearest Neighbors (KNN)
confidence: 99%
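A compact sketch of the rule this statement describes, Euclidean distances followed by a majority vote over the K nearest labels, assuming small NumPy arrays for illustration.

```python
import numpy as np

def euclidean_knn(X_train, y_train, x_test, k=3):
    # Label x_test by majority vote among the k training samples with the
    # smallest Euclidean distance, as described in the passage above.
    dists = np.linalg.norm(X_train - x_test, axis=1)  # distance to each sample
    nearest = np.argsort(dists)[:k]                   # indices of k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny numeric example (illustrative data).
X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y = np.array([0, 0, 1])
print(euclidean_knn(X, y, np.array([0.5, 0.5]), k=3))
```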