We live in the computer age, and many devices around the world have access to the Internet. How secure are these devices? Is there any guarantee that user information is not accessed by intruders? With the arrival of the Internet of Things (IoT), capabilities such as viewing the contents of a home refrigerator, connecting to the Internet from a car, and video chatting from a smart watch have entered our lives. The amount of malicious software is also increasing with these new connections. Researchers are increasingly emphasizing the importance of network security and intensifying their studies. Data preprocessing is very important when designing a secure system. This study examines the importance of normalization and standardization in data preprocessing for making machine learning approaches more successful at detecting attacks on IoT devices. The study was carried out with the Logistic Regression, Decision Tree, and Stochastic Gradient Descent machine learning algorithms using the Bot-IoT dataset, a popular dataset that is widely used in security studies on IoT devices. Normalization and standardization were applied to the Bot-IoT dataset separately as preprocessing steps, and the selected machine learning algorithms were then trained on the normalized and standardized datasets. The Accuracy, Precision, Recall, and F1 Score values obtained from training were examined. The results showed that standardization increased the accuracy of Logistic Regression up to 99.96%.
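The two preprocessing steps compared above can be illustrated with a minimal sketch. It assumes that "normalization" means min-max scaling to the range [0, 1] and that "standardization" means z-score scaling with the population standard deviation; the abstract does not state the exact formulas, and the function names `min_max_normalize` and `standardize` are hypothetical, not taken from the study.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly to [0, 1] (assumed meaning of 'normalization')."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit variance (z-score)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # A constant column has no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Illustrative feature column (made-up numbers, not from Bot-IoT).
packet_counts = [2, 4, 6]
normalized = min_max_normalize(packet_counts)
standardized = standardize(packet_counts)
```

In practice each feature column would be scaled using statistics computed on the training split only, so that no information from the test split leaks into the model.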
The Internet of Things (IoT) produces an enormous amount of data, which is used in all areas of our lives and increases the amount of data on the Internet with each passing day. Smart watches, robot vacuum cleaners, refrigerators with cameras, and more can all be considered IoT devices. Easy access to the Internet brings people disadvantages as well as advantages: malware and intruders can reach our devices and our information more easily over the Internet. Data security therefore becomes especially important for IoT devices, because access to personal data through the smart watches or refrigerators we use can pose a serious threat to individuals and their families. This study focuses on the importance of data preprocessing and on developing a hybrid machine learning-based intrusion detection system (IDS) for IoT. Decision Tree, a popular machine learning algorithm, and the N-BaIoT dataset were chosen for the investigation. The aim is to create a hybrid model by applying the K-means and Decision Tree algorithms to the N-BaIoT dataset together with undersampling and feature selection. In data preprocessing, feature selection was performed with the Chi-Square method and undersampling was performed with the RandomOverSampling method. K-means clustering was then applied to the processed dataset, and the clustering results were classified with the Decision Tree algorithm. While the error rate was 0.39% for predictions made with the Decision Tree alone, the developed hybrid model reduced it to 0.01%.
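The Chi-Square feature selection step mentioned above can be sketched as follows. This is a minimal illustration that assumes categorical (or already discretized) features and uses the Pearson chi-square statistic on a feature-versus-class contingency table; the abstract does not say how the features were scored, and the names `chi_square_score` and `select_top_k` are hypothetical.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the class labels."""
    n = len(feature)
    observed = Counter(zip(feature, labels))  # contingency table cells
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * l_count / n
            diff = observed.get((f_val, l_val), 0) - expected
            score += diff * diff / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (made up for illustration, not from the dataset in the study).
labels = ["benign", "benign", "attack", "attack"]
columns = {
    "informative": [0, 0, 1, 1],  # perfectly aligned with the label
    "noise": [0, 1, 0, 1],        # independent of the label
}
selected = select_top_k(columns, labels, k=1)
```

A higher score means the feature's distribution differs more between classes, so the highest-scoring features are the ones kept for the later clustering and classification stages.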