2019
DOI: 10.1007/s42452-019-1356-9
Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets

Abstract: Distance-based algorithms are widely used for data classification problems. The k-nearest neighbour classification (k-NN) is one of the most popular distance-based algorithms. This classification is based on measuring the distances between the test sample and the training samples to determine the final classification output. The traditional k-NN classifier works naturally with numerical data. The main objective of this paper is to investigate the performance of k-NN on heterogeneous datasets, where data can be…
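As a minimal sketch of the idea the abstract describes, the snippet below runs k-NN over heterogeneous records using a mixed distance: Euclidean on numeric attributes and 0/1 overlap on categorical ones (in the spirit of HEOM). The distance rule, data, and function names are illustrative assumptions, not the exact measures evaluated in the paper.

```python
import numpy as np
from collections import Counter

def mixed_distance(a, b, numeric_idx, categorical_idx):
    # HEOM-style rule (an illustrative assumption, not necessarily the
    # paper's measure): squared difference on numeric features, 0/1
    # mismatch on categorical ones. Numeric features would normally be
    # normalised first so no single attribute dominates.
    d = 0.0
    for i in numeric_idx:
        d += (float(a[i]) - float(b[i])) ** 2
    for i in categorical_idx:
        d += 0.0 if a[i] == b[i] else 1.0
    return np.sqrt(d)

def knn_predict(X_train, y_train, x_test, k, numeric_idx, categorical_idx):
    # Majority vote among the k training records closest to x_test.
    dists = [mixed_distance(x, x_test, numeric_idx, categorical_idx)
             for x in X_train]
    nearest = np.argsort(dists)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy heterogeneous records: (age, income, colour) -- hypothetical data.
X_train = [(25, 40.0, "red"), (47, 80.0, "blue"), (52, 75.0, "blue")]
y_train = ["A", "B", "B"]
print(knn_predict(X_train, y_train, (30, 45.0, "red"), k=3,
                  numeric_idx=[0, 1], categorical_idx=[2]))
```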

Cited by 190 publications (108 citation statements) | References 31 publications
“…KNN is a supervised learning model that is considered to be one of the simplest ML models available [13]. KNN is referred to as a lazy learner because no training is done with KNN; instead, the training data are used at prediction time to classify the data [13]. KNN operates under the assumption that similar data points group together, and it finds the closest data points using the K value, which can be set to any number [14].…”
Section: Background and Related Work
confidence: 99%
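The "lazy learner" behaviour this statement describes is visible in typical scikit-learn usage: fit() essentially stores the training samples, and the distance computations happen at prediction time. The synthetic dataset below is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# "Training" a lazy learner just stores the samples; the distance
# computations against every stored sample happen inside predict()/score().
clf = KNeighborsClassifier(n_neighbors=3)  # K can be set to any positive integer
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```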
“…The second concept involves using only a small subset of the features when splitting the nodes in the trees [23]. This is done to prevent overfitting, where the model uses the training data to inflate its predictions [13]. When making predictions with RF, the average of the individual trees' predictions is used to determine the overall class of the data; this process is called bootstrap aggregating [13].…”
Section: Background and Related Work
confidence: 99%
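The two mechanisms in this statement, bootstrap sampling of the training data and random feature subsets at each split, correspond directly to standard parameters of scikit-learn's RandomForestClassifier. The values below are illustrative defaults, not settings taken from the cited work.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

clf = RandomForestClassifier(
    n_estimators=100,     # number of trees whose predictions are aggregated
    max_features="sqrt",  # random feature subset considered at each split
    bootstrap=True,       # each tree is trained on a bootstrap sample
    random_state=0,
)
clf.fit(X, y)
print(clf.predict(X[:3]))
```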
“…KNN is one of the most popular distance-based classification techniques. KNN measures the distances between the training samples and the test samples to determine the final classification output [36].…”
Section: Chemometrics
confidence: 99%
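The distance measurement this statement refers to can be expressed as a single pairwise-distance computation between the test and training samples; the arrays below are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])  # illustrative
X_test = np.array([[0.5, 0.5], [3.0, 3.0]])

# dists[i, j] = Euclidean distance from test sample i to training sample j;
# k-NN keeps the k smallest entries in each row.
dists = cdist(X_test, X_train)
print(dists.argsort(axis=1)[:, :2])  # indices of the 2 nearest neighbours
```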
“…The K-Nearest Neighbour classifier is designed to function by measuring distance. It attempts to classify numerical data records by finding the K nearest neighbours, measuring the distance between the training samples and the test samples according to the Euclidean metric [33]. In this approach, the output is a class label, and K is usually a small positive integer giving the number of neighbours considered.…”
Section: K-Nearest Neighbors (KNN)
confidence: 99%
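A compact sketch of the rule this statement describes, Euclidean distances followed by a majority vote over the K nearest labels, assuming small NumPy arrays for illustration.

```python
import numpy as np

def euclidean_knn(X_train, y_train, x_test, k=3):
    # Label x_test by majority vote among the k training samples with the
    # smallest Euclidean distance, as described in the passage above.
    dists = np.linalg.norm(X_train - x_test, axis=1)  # distance to each sample
    nearest = np.argsort(dists)[:k]                   # indices of k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny numeric example (illustrative data).
X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y = np.array([0, 0, 1])
print(euclidean_knn(X, y, np.array([0.5, 0.5]), k=3))
```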