The proliferation of Internet of Things (IoT) markets over the last decade has introduced new challenges for network traffic analysis and for processing packet flows to identify IoT devices. These devices suffer from resource scarcity, which makes them vulnerable to spoofing operations. In such circumstances, a device can be recognized by identifying its fingerprint. In this paper, a novel approach to eliciting a Device FingerPrint (DFP) is presented: 30 features are extracted from the traffic packets collected from 19 IoT devices during setup and startup operations. A Raspberry Pi 3 Model B+ is configured as an access point to collect and analyze the traffic of seven networked IoT devices using the Wireshark Network Protocol Analyzer, while the traffic of the remaining IoT devices is taken from a publicly available network traffic dataset. Each IoT device's feature extraction process starts with the Extensible Authentication Protocol over LAN (EAPOL) exchange and continues through the subsequent protocols until the first Transmission Control Protocol (TCP) session related to that device is closed. Based on the variation observed in the device traffic features, 20 fingerprints are created for each device. Gaussian Naive Bayes (GNB), a supervised machine learning method based on Bayes' theorem, is used to identify the fingerprints of individual known devices and to isolate unknown ones. The performance of the proposed technique was evaluated using two measures, F1-score and identification accuracy. The average F1-score was around 0.99, while the overall identification accuracy was 98.35%.
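The identification step in this abstract can be illustrated with a minimal Gaussian Naive Bayes sketch in plain Python. It is not the authors' implementation: the two toy features, the device names, and the `min_log_score` rejection threshold used to flag unknown devices are all assumptions made for illustration, since the abstract does not say how unknown fingerprints are isolated.

```python
import math
from collections import defaultdict


def fit_gnb(samples, labels, var_floor=1e-6):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[y] = (math.log(n / len(samples)), means, variances)
    return model


def log_joint(model, x):
    """Log prior plus Gaussian log-likelihood of x under every known class."""
    scores = {}
    for y, (log_prior, means, variances) in model.items():
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        scores[y] = log_prior + log_likelihood
    return scores


def identify(model, x, min_log_score=-50.0):
    """Return the best-scoring device, or "unknown" if even that score is too low.

    min_log_score is a hypothetical rejection threshold, not a value from the paper.
    """
    scores = log_joint(model, x)
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_log_score else "unknown"


# Toy fingerprints with two made-up features per fingerprint.
train = [
    [60.0, 3.0], [62.0, 3.0], [61.0, 4.0],     # hypothetical "camera"
    [120.0, 8.0], [118.0, 9.0], [121.0, 8.0],  # hypothetical "plug"
]
labels = ["camera"] * 3 + ["plug"] * 3
model = fit_gnb(train, labels)

print(identify(model, [61.0, 3.0]))
print(identify(model, [119.0, 8.0]))
print(identify(model, [300.0, 40.0]))
```

Thresholding the joint log score is only one possible way to reject unseen devices; in practice the threshold would need to be tuned on held-out fingerprints.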
The dramatic growth of Internet of Things (IoT) devices in recent years has increased the vulnerabilities of IoT networks and introduced new challenges for machine learning (ML) algorithms that detect networked devices. A Device Fingerprint (DFP) can be created by extracting the network traffic features related to a device, excluding the identities assigned to it. In this paper, Device Fingerprints for 20 IoT devices are created by extracting 30 features during the startup operation. The Wireshark Network Protocol Analyzer is used to collect the network traffic of 8 home IoT devices, while the traffic of the remaining devices is taken from the publicly available captures_IoT-Sentinel dataset. Four supervised machine learning algorithms were applied and tested to detect authorized devices and isolate unknown devices: Support Vector Machine (SVM), Decision Tree (DT), Ensemble Random Forest (RF), and Gradient Boosting Classifier (GBC). The Random Forest and Gradient Boosting Classifier models both showed the best results, with an average overall accuracy of about 98.8%, differing only slightly from the accuracy of the Decision Tree. A voting classifier built from the three most accurate estimators (DT, RF, and GBC) achieved an average overall accuracy of 99.5%.
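The voting step can be sketched as a simple majority vote over the labels predicted by the three estimators. The abstract does not state whether hard or soft voting was used, so hard voting is an assumption here, and the prediction lists and device names below are invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-estimator label lists into one label per sample (hard voting)."""
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns the most frequent label; on a tie it returns
        # the label encountered first, i.e. the earliest estimator's vote.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from the three estimators for four fingerprints.
dt_pred = ["camera", "plug", "hub", "unknown"]
rf_pred = ["camera", "plug", "plug", "unknown"]
gbc_pred = ["camera", "hub", "hub", "camera"]

final = majority_vote([dt_pred, rf_pred, gbc_pred])
print(final)
```

With three estimators, a single estimator's mistake on a sample is outvoted whenever the other two agree, which is one plausible reason an ensemble can edge out its individual members.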
The pervasive availability of Internet of Things (IoT) devices on the market makes them attractive targets for cyber-attacks, since most manufactured IoT devices are resource-constrained. The first strong line of protection for an IoT network against these vulnerabilities is detecting IoT devices, especially unauthorized ones, using machine learning (ML) algorithms. It is difficult or even impossible to identify individual unknown IoT devices during the setup phase, but identifying their manufacturers is worth considering. In this paper, a new method based on fingerprint generation is introduced to detect connected devices in the setup phase. Fingerprints for 21 different IoT devices are generated from the devices’ network traffic. The generated fingerprints are divided into four groups according to the devices’ manufacturers or the degree of similarity between fingerprints. A Gradient Boosting algorithm is applied to achieve these purposes. The proposed method is considered a preparatory study for the early detection of unauthorized devices. The performance of the proposed method was evaluated using two metrics: identification accuracy and F1-score. The average identification accuracy was around 98.65%, while the average F1-score was about 99%.
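The two reported metrics can be computed as in the following sketch. The abstract does not say how the F1-score was averaged across groups, so the unweighted (macro) average is an assumption, and the group labels `g1` to `g4` and the predictions are made-up placeholders for the four fingerprint groups.

```python
def accuracy(y_true, y_pred):
    """Fraction of fingerprints assigned to the correct group."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each group."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-group F1 scores (assumed averaging scheme)."""
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Hypothetical true and predicted group labels for six fingerprints.
y_true = ["g1", "g1", "g2", "g2", "g3", "g4"]
y_pred = ["g1", "g1", "g2", "g3", "g3", "g4"]

print(accuracy(y_true, y_pred))
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Accuracy and macro F1 can diverge when the groups are imbalanced, which is a common reason to report both.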