2019
DOI: 10.1109/access.2019.2917532

Semi-Supervised K-Means DDoS Detection Method Using Hybrid Feature Selection Algorithm

Abstract: A distributed denial of service (DDoS) attack is an attempt to make an online service unavailable by overwhelming it with traffic from multiple sources. An effective method is therefore needed to detect DDoS attacks in massive volumes of traffic data. However, existing schemes have limitations: supervised learning methods need large amounts of labeled data, and unsupervised learning algorithms have relatively low detection rates and high false positive rates. In order to tackle the…

Cited by 121 publications (48 citation statements)
References 39 publications (34 reference statements)
“…The Smart Detection system has reached high accuracy and low false-positive rate. Experiments were conducted using two Virtual Linux boxes,
Define all the descriptor database variables as the current variables;
(5) while True do
(6) Split dataset in training and test partitions;
(7) Create and train the model using training data partition;
(8) Select the most important variables from the trained model;
(9) Calculate the cumulative importance of variables from the trained model;
(10) if max (cumulative importance of variables) < Variable importance threshold then
(11) Exit loop;
(12) end
(13) Train the model using only the most important variables;
(14) Test the trained model and calculate the accuracy;
(15) if Calculated accuracy < Accuracy threshold then
(16) Exit loop;
(17) end
(18) Add current model to optimized model set;
(19) Define the most important variables from the trained model as the current variables;
(20) end
(21) end
(22) Group the models by number of variables;
(23) Remove outliers from the grouped model set;
(24) Select the group of models with the highest frequency and their number of variables "N";
(25) Rank the variables by the mean of the importance calculated in step 7;
(26) Return the "N" most important variables;
[2004][2005] have been used by the researchers to evaluate the performance of their proposed intrusion detection and prevention approaches. However, many such datasets are out of date and unreliable to use [25].…”
Section: Results (mentioning)
confidence: 99%
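
The closing steps of the variable-selection listing quoted above (steps 22 and 24 to 26) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the citing authors' implementation: the function name `most_frequent_variable_set`, the dict-per-model input format, and the variable names in the usage note are hypothetical, and the outlier removal of step 23 is left out.

```python
from collections import Counter, defaultdict
from statistics import mean


def most_frequent_variable_set(model_importances):
    """Pick the N highest-ranked variables from a set of optimized models.

    model_importances: list of dicts, one per model, mapping a variable
    name to its importance in that model (assumed input format).
    """
    if not model_importances:
        return []

    # Steps 22 and 24: group models by how many variables they use and
    # take N from the most frequent group size.
    size_counts = Counter(len(m) for m in model_importances)
    n = size_counts.most_common(1)[0][0]

    # Step 25: rank variables by their mean importance across the models
    # in which they appear.
    importances_by_variable = defaultdict(list)
    for model in model_importances:
        for name, importance in model.items():
            importances_by_variable[name].append(importance)
    ranked = sorted(
        importances_by_variable,
        key=lambda name: mean(importances_by_variable[name]),
        reverse=True,
    )

    # Step 26: return the N most important variables.
    return ranked[:n]
```

For example, given three models where two use two variables and one uses three, N is 2, and the call returns the two variables with the highest mean importance.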
“…This has led researchers to use autonomous solutions that can operate (detect and mitigate) based on the behavior and characteristics of the traffic. In this sense, the adoption of solutions with techniques based on artificial intelligence, mainly machine learning (ML), has been distinguished by offering high flexibility in the classification process, consequently improving the detection of malicious traffic [18,19]. The industrial sector offers DDoS protection as a service through large structures, usually operated by specialized providers [6] such as Akamai, Cloudflare, and Arbor Networks, which have large processing capacity and proprietary filtering mechanisms.…”
Section: Problem Statements DDoS Detection and Mitigation (mentioning)
confidence: 99%
“…As can be seen in the work of [16] and [17], the use of the K-means algorithm presented a very high hit rate compared to other techniques. This algorithm will run in the Cloud (Fig.…”
Section: Fog Computing Architecture (mentioning)
confidence: 94%
“…As shown in the article of [16], it was proposed a DDoS detection model with K-Means algorithm customization that compared to other works provided a higher rate of detection of anomalies, taking into account factors such as True Positive Rate, False Positive Rate and Recall Rate. In addition, is used the main Open Source Dataset (DARPA, CAIDA, CICIDS), as well as the real-world dataset to proposed benchmark.…”
Section: B Work Related To Security Issues (mentioning)
confidence: 99%
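
The evaluation factors named in the statement above follow directly from confusion-matrix counts. The sketch below assumes binary labels where 1 marks attack traffic and 0 marks normal traffic; the function name `detection_rates` is hypothetical. Recall is the same quantity as the true positive rate.

```python
def detection_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    Assumes 1 = attack and 0 = normal in both sequences.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal passed

    # TPR (recall) = TP / (TP + FN); FPR = FP / (FP + TN).
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


print(detection_rates([1, 1, 1, 0, 0, 0, 0, 1],
                      [1, 1, 0, 0, 1, 0, 0, 1]))  # prints (0.75, 0.25)
```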
“…Furthermore, the adequacy of the result in the case of a complex cluster shapes is questionable (this model is proved to work fine with the ball-shaped clusters [6]). The result is sensitive to the outliers (standalone objects) [7,8] and depends on the chosen distance measure and the data normalization method. This model does not take into account the dissimilarity between the objects in different clusters, and the application of the k-means model results in some solution X 1 , .…”
Section: Problem Statement (mentioning)
confidence: 99%
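
The outlier sensitivity mentioned in the statement above can be seen with a plain Lloyd-style k-means on a handful of points. This is a minimal sketch using Euclidean distance and fixed initial centroids; the function name `kmeans` and the sample data in the usage note are made up for illustration and do not come from any of the cited papers.

```python
import math


def kmeans(points, initial_centroids, iterations=20):
    """Plain Lloyd's algorithm with Euclidean distance.

    Returns the centroids and the last cluster assignment computed.
    """
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters
```

With the points (1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9) and initial centroids (0, 0) and (10, 10), the second centroid settles near (8.33, 8.33). Adding a single standalone point at (30, 30) moves it to (13.75, 13.75), even though the cluster membership of the other six points does not change.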