2020
DOI: 10.1007/978-3-030-50417-5_35

Malicious Domain Detection Based on K-means and SMOTE

citations
Cited by 17 publications
(7 citation statements)
references
References 14 publications
“…The trend in network patterns accessed by internet users, together with network traffic profiles built from high data traffic volumes, yields three clusters based on traffic usage (high, medium, and low) according to the different service protocols and IP addresses [10]. For detecting malicious domains, clustering a large amount of DNS traffic data with the K-Means method is likewise appropriate [11]. Research that uses DNS logs to cluster internet traffic usage is much needed to keep traffic running smoothly [12], [13]; this study therefore aims to cluster internet traffic usage using K-Means clustering.…”
Section: Introduction | unclassified
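The three usage clusters described in the statement above can be illustrated with a minimal K-Means sketch on scalar traffic volumes. This is not the citing authors' code: the function name `kmeans_1d`, the sample volumes, and the quantile-based initialisation are all assumptions made for illustration.

```python
def kmeans_1d(values, k=3, iters=50):
    """Lloyd's algorithm on scalar values; returns (sorted centroids, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: start from evenly spaced order statistics so runs are repeatable.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
    return centroids, labels


# Hypothetical per-host traffic volumes in MB (made-up numbers).
volumes_mb = [2, 3, 5, 48, 52, 55, 490, 510, 530]
centroids, labels = kmeans_1d(volumes_mb, k=3)
names = ["low", "medium", "high"]
for volume, label in zip(volumes_mb, labels):
    print(volume, names[label])
```

Sorting the centroids before labelling means label 0 always maps to the lowest-volume cluster, which is what lets the sketch attach the names low, medium, and high.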
“…This model extracts static lexical features and dynamic DNS resolving features to profile every domain name (DN) from the DNS traffic data. In [28], the authors also address the imbalance problem and present a KMSMOTE method that uses the SMOTE and K-Means clustering algorithms. The system assumes that malicious DNs leave traces in DNS traffic, have lower DN registration costs, and reuse network resources.…”
Section: Related Work | mentioning
confidence: 99%
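The statement above names SMOTE and K-Means but does not say how KMSMOTE combines them. One common way to pair the two, sketched here purely as an assumption, is to interpolate new minority samples only between neighbours that share a cluster. The function `smote_within_clusters`, its parameters, and the sample points are hypothetical, and the cluster ids are assumed to come from an earlier K-Means step.

```python
import math
import random


def smote_within_clusters(minority, cluster_ids, n_new, k=2, seed=0):
    """Create n_new synthetic minority points.

    Each synthetic point lies on the segment between a minority sample and
    one of its k nearest minority neighbours from the same cluster.
    """
    rng = random.Random(seed)
    by_cluster = {}
    for point, cid in zip(minority, cluster_ids):
        by_cluster.setdefault(cid, []).append(point)
    # Interpolation needs at least two points in a cluster.
    usable = [pts for pts in by_cluster.values() if len(pts) >= 2]
    if not usable:
        raise ValueError("need a cluster with at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        pts = rng.choice(usable)
        base_i = rng.randrange(len(pts))
        base = pts[base_i]
        others = [p for j, p in enumerate(pts) if j != base_i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical 2-D features of malicious (minority) domains and their
# cluster ids, assumed to come from a prior K-Means run.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
cluster_ids = [0, 0, 0, 1, 1]
synthetic = smote_within_clusters(minority, cluster_ids, n_new=6)
print(synthetic)
```

Restricting interpolation to a single cluster keeps synthetic points from landing in the empty space between two distant groups of minority samples.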
“…Focus on groundtruth labeling
[24] | - | - | Heuristic | [30], [47] | 54K | Based on intuition
[25] | - | - | Heuristic | [30], [41] | 10M | Slows down n/w
[26] | - | - | Heuristic | [30], [31], [44] | Based on intuition
[27] | - | DI | EEA, HAC EEA | [30], [39], [46], [48] | 10K | Focus only on data imbalance
[28] | - | DI | CatBoost, SVM, GBDT, XGBoost | [30], [43], [49], [34], [41], [50] | 16K | Oversampling
[29] | APT | EI | ELM, LR, SVM, CART, BPNN | [30], [41], [34], [42] | 40K | Only for targeted attacks
Ours | Eth | - | - | K-Means, 11 ML Algos using TPOT | [33], [49], [51], [44], [52], [42], [53] | 335M | -
• B/C = Blockchain, Eth = Ethereum Blockchain data
• Features: N = DN String based, Q = DNS Query based, G = DNS Graph based, T = Temporal aspect based, O = Other, particular feature not used
• Detection Of: AG = Algorithmic Generated Names, BT = Botnet, FF = Fast-Flux, APT = Advanced Persistent Threats, - = no specific mention but targets DNs in general
• Tackles: R = Reputation, L = Ground Truth Labeling, DI = Data Imbalance, EI = Efficiency Improvement, - = no specific mention
• ML Algo: GB = Gradient Boosting, SVM = Support Vector Machine, RF = Random Forest, KNN = K-Nearest Neighbors, NB = Naive Bayes, BC = Bayesian Classifier, LBS = Logit-Boost Strategy, RC = Random Committee, EEA = EasyEnsemble Algorithm, ELM = Extreme Learning Machine, GBDT = Gradient Boosting Decision Tree, XGBoost = eXtreme Gradient Boosting, LR = Logistic Regression, BPNN = Back Propagation Neural Networks
• Dataset Size: no mention
• Issues: DGA = Domain Generation Algorithm, RDNS = Recursive DN System, n/w = network, IP = IP address
ground truth information about the DNs is extracted from [33], [42],…”
Section: A Data Collection and Pre-processing | mentioning
confidence: 99%
“…In addition, some researchers identify the characteristics of malicious domain names by analyzing DNS traffic data. Researchers use many detection methods; for example, the K-means algorithm and the SMOTE method are combined [6], a convolutional neural network structure and a cyclic neural network are combined to detect malicious domain names involved in botnets [7], and an RBF kernel is added to the support vector machine algorithm to improve the detection of malicious domain names [8].…”
Section: Introduction | mentioning
confidence: 99%
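For readers unfamiliar with the RBF kernel mentioned in the statement above, the standard definition is k(x, y) = exp(-gamma * ||x - y||^2). The sketch below computes it directly. The feature vectors and the value of `gamma` are assumptions for illustration, and nothing here reflects the configuration used in [8].

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical two-feature vectors for three domain names.
a = (0.2, 0.1)
near = (0.25, 0.1)
far = (0.9, 0.8)

print(rbf_kernel(a, near), rbf_kernel(a, far))
```

The kernel equals 1 for identical vectors and decays toward 0 as they move apart, so a larger `gamma` makes the similarity more local.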