2020
DOI: 10.48550/arxiv.2010.03856
Preprint

Transcending Transcend: Revisiting Malware Classification in the Presence of Concept Drift

Abstract: Machine learning for malware classification shows encouraging results, but real deployments suffer from performance degradation as malware authors adapt their techniques to evade detection. This phenomenon, known as concept drift, occurs as new malware examples evolve and become less and less like the original training examples. One promising method to cope with concept drift is classification with rejection in which examples that are likely to be misclassified are instead quarantined until they can be expertl…
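
The idea of classification with rejection described in the abstract can be illustrated with a minimal sketch. This is not the method from the paper, whose details are cut off in the abstract above; it simply assumes each sample comes with a predicted malware probability and quarantines any sample whose confidence falls below a fixed threshold. The function name `classify_with_rejection`, the `threshold` parameter, and the 0.8 default are hypothetical choices for illustration.

```python
from typing import List, Sequence


def classify_with_rejection(
    malware_probs: Sequence[float], threshold: float = 0.8
) -> List[str]:
    """Label each sample, or quarantine it when the prediction is uncertain.

    Assumption: malware_probs[i] is a classifier's estimated probability
    that sample i is malware. The threshold value is illustrative only.
    """
    decisions = []
    for p in malware_probs:
        # Confidence in the predicted class, whichever class that is.
        confidence = max(p, 1.0 - p)
        if confidence < threshold:
            # Likely to be misclassified: hold for later expert review.
            decisions.append("quarantine")
        elif p >= 0.5:
            decisions.append("malware")
        else:
            decisions.append("benign")
    return decisions


if __name__ == "__main__":
    print(classify_with_rejection([0.95, 0.55, 0.05, 0.30]))
```

Raising the threshold quarantines more samples and should leave fewer errors among the accepted ones, at the cost of more manual analysis; where to set it is a deployment decision that this sketch does not address.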

Cited by 2 publications (4 citation statements)
References 18 publications
“…Many learning-based systems in security are evaluated solely in laboratory settings, overstating their practical impact. A common example are detection methods evaluated only in a closed-world setting with limited diversity and no consideration of non-stationarity [15,70]. For example, a large number of website fingerprinting attacks are evaluated only in closed-world settings spanning a limited time period [71].…”
Section: % Present
mentioning
confidence: 99%
“…However, public datasets need to be treated with caution. Firstly, data ages and becomes less relevant in the fast-moving security landscape, partially due to concept drift [15,70,85,101]. Secondly, the characteristics of the data are increasingly exposed and thereby lead to implicit data snooping (P3) [see 1,88].…”
Section: Data Collection and Labeling
mentioning
confidence: 99%
“…For example, Jordaney et al [54] proposed the Transcend framework to identify concept drift to establish prediction indicators. Barbero et al [55] based on the former framework for performing rejection classification, has improved efficiency and reduced computing expenses. For an ML-based classifier to be highly sustainable, it is critical to understand the underlying features: the ability to distinguish benign applications from malware and extract the changing pattern of those features through evolutionary processes [56].…”
mentioning
confidence: 99%