Proceedings of the 15th International Conference on Availability, Reliability and Security 2020
DOI: 10.1145/3407023.3409190

Making use of NXt to nothing? The effect of class imbalances on DGA detection classifiers

Abstract: Numerous machine learning classifiers have been proposed for binary classification of domain names as either benign or malicious, and even for multiclass classification to identify the domain generation algorithm (DGA) that generated a specific domain name. Both classification tasks have to deal with the class imbalance problem of strongly varying amounts of training samples per DGA. Currently, it is unclear whether the inclusion of DGAs for which only a few samples are known to the training sets is beneficial…
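As a rough illustration of the binary classification task described in the abstract, the sketch below trains a character-n-gram logistic-regression model to separate benign from DGA-looking domain names. The pipeline and the toy data are assumptions chosen for brevity; they are not the paper's actual classifiers or datasets.

```python
# Minimal sketch of binary domain-name classification (benign vs. DGA-generated).
# Illustrative only: the character-2-gram / logistic-regression pipeline and the
# toy samples below are assumptions, not the authors' method or data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: domain names with labels (0 = benign, 1 = DGA).
domains = ["google.com", "wikipedia.org", "xjwqkzlpoa.net", "qzvbnmrtyu.info"]
labels = [0, 0, 1, 1]

# Character 2-grams capture the "random-looking" structure typical of many DGAs.
model = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(domains, labels)

print(model.predict(["asdkjqwzxlv.com"]))  # expected to lean towards the DGA class
```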

Cited by 13 publications (2 citation statements)
References 17 publications
“…We use the same evaluation metrics as in the original papers: accuracy.¹ We intentionally include underrepresented classes because the inclusion of a few training samples per class allows a classifier to detect various underrepresented DGAs with high probability that would otherwise be missed. At the same time, this does not affect a classifier's ability to recognize well-represented classes [15]. We present the averaged results of the four-fold cross validation in Table 1.…”
Section: State-of-the-art Results Reproduction
confidence: 99%
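The cross-validation setup quoted above can be illustrated with a short sketch. The snippet is an assumption for illustration only, not the cited authors' code: it uses scikit-learn's StratifiedKFold with hypothetical DGA family labels to show how even a family with only a handful of samples remains represented in every training fold.

```python
# Sketch of a four-fold cross-validation that keeps underrepresented DGA families
# in the training data. Family names, counts, and placeholder features are
# hypothetical; this is not the cited papers' exact setup.
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical multiclass labels: one label per DGA family, plus "benign".
# One family contributes only a handful of samples (the imbalance problem).
y = np.array(["benign"] * 100 + ["conficker"] * 40 + ["suppobox"] * 4)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder features

skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Even the rare class stays represented in the training portion of each fold.
    train_classes, counts = np.unique(y[train_idx], return_counts=True)
    print(fold, dict(zip(train_classes, counts)))
```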
“…It is important to understand the applied domain name preprocessing as this step can introduce significant classification biases. The works (e.g., [13], [14], [15], [34]) that operate on single NXDs for classification make the data used unique and filter all benign samples against OSINT feeds to remove potentially contained malicious domains before training and testing a classifier. Other than that, they do not apply any filtering to the benign-labeled data used, since it is captured from real-world networks.…”
Section: Preprocessing
confidence: 99%
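The preprocessing step quoted above, filtering benign-labeled NXDs against OSINT feeds, can be sketched as follows. Feed file names, formats, and helper names are hypothetical; this only illustrates the described filtering, not the cited works' actual tooling.

```python
# Sketch of the benign-sample preprocessing described above: NXDomain responses
# labeled benign are filtered against OSINT feeds so that domains known to be
# malicious are removed before training/testing. All inputs here are hypothetical.
def load_osint_blocklist(paths):
    """Union of known-malicious domains from one or more OSINT feed dumps."""
    blocklist = set()
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                domain = line.strip().lower()
                if domain and not domain.startswith("#"):
                    blocklist.add(domain)
    return blocklist

def filter_benign(nxd_samples, blocklist):
    """Keep only benign-labeled NXDs that do not appear in any OSINT feed."""
    return [d for d in nxd_samples if d.lower() not in blocklist]

# Example usage with hypothetical file names:
# blocklist = load_osint_blocklist(["feed_a.txt", "feed_b.txt"])
# clean_benign = filter_benign(benign_nxds, blocklist)
```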