As the number of Internet of Things (IoT) devices and applications increases, the capacity of IoT access networks comes under considerable stress. This can create significant performance bottlenecks in various layers of an end-to-end communication path, including the scheduling of the spectrum, the resource requirements for processing the IoT data at the Edge and/or Cloud, and the attainable delay for critical emergency scenarios. Thus, a proper classification or prediction of the time-varying traffic characteristics of the IoT devices is required. However, this classification largely remains an open challenge. Most existing solutions are based on machine learning techniques, which nonetheless incur high computational cost and do not consider the fine-grained flow characteristics of the traffic. To this end, this paper makes the following four contributions. Firstly, we provide an extended feature set including flow-, packet- and device-level features to characterize the IoT devices in the context of a smart environment. Secondly, we propose a custom weighting-based preprocessing algorithm to determine the importance of the data values. Thirdly, we present insights into traffic characteristics using feature selection and correlation mechanisms. Finally, we develop a two-stage learning algorithm and demonstrate its ability to accurately categorize the IoT devices in two different datasets. The evaluation results show that the proposed learning framework achieves 99.9% accuracy for the first dataset and 99.8% accuracy for the second. Additionally, for the first dataset we achieve precision and recall of 99.6% and 99.5%, while for the second dataset the precision and recall attained are 99.6% and 99.7%, respectively. These results show that our approach clearly outperforms other well-known machine learning methods.
Hence, this work provides a useful model deployed in a realistic IoT scenario, where IoT traffic and devices' profiles are predicted and classified, while facilitating the data processing in the upper layers of an end-to-end communication model.
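For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision and recall can be computed for a multi-class device classifier. It assumes macro-averaging over classes (the abstract does not state which averaging was used), and the device labels and the function name `macro_precision_recall` are purely illustrative.

```python
from collections import Counter

def macro_precision_recall(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pred_count = Counter(y_pred)   # how often each class was predicted
    true_count = Counter(y_true)   # how often each class actually occurs
    tp = Counter()                 # correct predictions per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
    accuracy = sum(tp.values()) / len(y_true)
    # Macro-averaging (an assumption): every class counts equally.
    precision = sum(
        (tp[c] / pred_count[c]) if pred_count[c] else 0.0 for c in labels
    ) / len(labels)
    recall = sum(
        (tp[c] / true_count[c]) if true_count[c] else 0.0 for c in labels
    ) / len(labels)
    return accuracy, precision, recall

# Hypothetical device labels, for illustration only.
y_true = ["camera", "camera", "plug", "plug", "sensor", "sensor"]
y_pred = ["camera", "plug", "plug", "plug", "sensor", "sensor"]
acc, prec, rec = macro_precision_recall(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Precision penalizes devices wrongly assigned to a class, while recall penalizes devices of a class that were missed, which is why the two values are reported separately.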
As the number of Internet of Things (IoT) devices and applications increases, the capacity of IoT access networks comes under considerable stress. This can create significant performance bottlenecks in various layers of an end-to-end communication path, including the scheduling of the spectrum, the resource requirements for processing the IoT data at the Edge and/or Cloud, and the attainable delay for critical emergency scenarios. Thus, the time-varying traffic characteristics of the IoT devices need to be classified or predicted. However, this classification largely remains an open challenge. Most existing solutions are based on machine learning techniques, which nonetheless incur high computational cost and do not consider the fine-grained flow characteristics. To this end, in this paper we design a two-stage classification framework that utilizes both network and statistical features to characterize the IoT devices in the context of a smart city. We first clean and preprocess the data and then analyze the dataset to extract the network and statistical feature set for different types of IoT devices. The evaluation results show that the proposed classification achieves 99% accuracy, with a Matthews Correlation Coefficient of 0.96, when compared against other techniques.
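The Matthews Correlation Coefficient mentioned above can be illustrated with a short sketch. This is the standard binary form of the metric; the abstract does not say whether a binary or multi-class variant was used, so treat the binary case, the function name and the toy labels as assumptions.

```python
import math

def matthews_corrcoef_binary(y_true, y_pred):
    """Binary MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels, for illustration only (1 = target device type, 0 = other).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef_binary(y_true, y_pred)
print(f"MCC = {mcc:.2f}")
```

Unlike accuracy, the coefficient uses all four cells of the confusion matrix, so it stays informative when some device types are much rarer than others.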
The success of the Internet of Things (IoT) has significantly increased the volume of data generated by various smart applications. However, as many of these applications are characterized by strict Quality of Service (QoS) requirements, there is a growing need to accurately predict typical performance parameters such as throughput. This prediction should be based on the applications' traffic profiles and at the same time reflect the network uncertainty that IoT access networks add to the overall communication. In this work, we deployed six different smart building applications in a real testbed while creating considerable traffic contention in an IEEE 802.15.4 access network. After preprocessing the raw data and applying a feature engineering mechanism, we apply five different regression learning approaches to each application and predict its throughput. Using several prediction error metrics, together with time metrics such as training and inference time, we show that multiple linear regression achieves high accuracy while outperforming other well-known machine learning methods.
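The evaluation procedure described above (fit a multiple linear regression, then report error and time metrics) can be sketched as follows. This is a minimal stdlib-only illustration using ordinary least squares on synthetic data; the two input features, the data values and the helper names `fit_linear` and `predict` are hypothetical and are not taken from the paper.

```python
import math
import time

def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(w, X):
    """Apply weights w (intercept first) to each feature row."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]

# Synthetic data generated from throughput = 2 + 3*x1 - 1*x2 (illustrative).
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [3, 7, 7, 11, 12]

t0 = time.perf_counter()
w = fit_linear(X, y)
train_time = time.perf_counter() - t0

t0 = time.perf_counter()
y_hat = predict(w, X)
inference_time = time.perf_counter() - t0

mae = sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))
print(f"MAE={mae:.2e} RMSE={rmse:.2e} "
      f"train={train_time:.2e}s inference={inference_time:.2e}s")
```

A closed-form linear fit has no iterative training loop, which is one plausible reason a linear model can score well on time metrics.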
The Internet of Things (IoT), along with advances in the recently emerged Edge Computing environment, has allowed the introduction of new and very diverse applications that can facilitate our everyday life. However, one intrinsic characteristic of IoT is the heterogeneity of the IoT devices, which are continuously connected and disconnected, creating a highly volatile communication environment. In addition, new types of IoT devices are constantly manufactured, making a supervised categorization approach inapplicable due to the lack of historical data. Nonetheless, the classification or type identification of the IoT devices is important for the management and decision making of IoT applications, and can be used for traffic characterization, density prediction, network planning and security purposes, among others. Accordingly, in this paper we propose for the first time an unsupervised machine learning methodology for IoT device categorization that leverages traffic characteristics obtained at the network level. In this way, we remove the need for an annotated dataset, and our model can also work efficiently with new and not previously detected IoT devices. We experimentally evaluate our approach on a real dataset using two clustering algorithms, namely K-Means and BIRCH. The experimental evaluation presents promising results that support the applicability of unsupervised approaches to the IoT device categorization problem.
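To make the unsupervised idea concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) grouping devices by traffic features without any labels. Only K-Means is shown; BIRCH is omitted for brevity. The two features (for example, mean packet size and packet rate), their values and the naive first-k initialization are assumptions made for illustration, not details from the paper.

```python
import math

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Naive initialization (assumption): use the first k points as centroids.
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical per-device traffic features, for illustration only.
devices = [(1.0, 0.1), (10.0, 5.0), (1.2, 0.2),
           (0.9, 0.15), (10.5, 5.2), (9.8, 4.9)]
labels, centroids = kmeans(devices, k=2)
print(labels)
```

Because no labels are required, a previously unseen device can be assigned to the nearest existing centroid, which matches the motivation given above for handling new device types.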