2023
DOI: 10.48550/arxiv.2302.04332
Preprint

Continuous Learning for Android Malware Detection

Abstract: Machine learning methods can detect Android malware with very high accuracy. However, these classifiers have an Achilles heel, concept drift: they rapidly become out of date and ineffective, due to the evolution of malware apps and benign apps. Our research finds that, after training an Android malware classifier on one year's worth of data, the F1 score quickly dropped from 0.99 to 0.76 after 6 months of deployment on new test samples. In this paper, we propose new methods to combat the concept drift problem o…

Cited by 2 publications (5 citation statements)
References 28 publications
“…Our current framework relies on the availability of ground truth labels for individual families every month. This is similar to Chen et al [13] where they assume a monthly labeling budget for active learning. Although this is an appropriate assumption for a post-hoc forensics framework, ground truth labels are scarce in real-world deployments.…”
Section: Limitations and Discussion (mentioning)
confidence: 53%
“…In our settings, drift was mostly caused by a malware family (although which one or whether more were present is irrelevant to the problem at hand). This is either because of the actual evolution of malware behaviors, imprecise abstractions and representations of the datasets, or a combination thereof, which calls for further research on drift detection [11,13,25,35,47] and programs' representations.…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition to the considered metrics, a comparison of the False Positive Rate and Detection rate for the proposed work is also used to represent the efficacy of the proposed technique. A False Positive Rate (FPR) also known as Fall-Out is the measurement of accuracy and calculated with equation (17). False Positive Rate is defined as a ratio of false positive with a sum of false positive and true negative.…”
Section: Results (mentioning)
confidence: 99%
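
The False Positive Rate described in the statement above, the ratio of false positives to the sum of false positives and true negatives, can be sketched as follows. This is a minimal illustration only: the function name `false_positive_rate` and the sample counts are hypothetical, and the citing paper's equation (17) is not reproduced here.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Return FPR = FP / (FP + TN), the share of benign samples flagged as malicious."""
    negatives = false_positives + true_negatives
    if negatives == 0:
        # With no benign (negative) samples the ratio is undefined.
        raise ValueError("FPR is undefined without any negative samples")
    return false_positives / negatives


# Hypothetical counts: 5 benign apps flagged as malware, 95 correctly passed.
print(false_positive_rate(5, 95))  # 5 / (5 + 95) = 0.05
```

A lower value means fewer benign apps are wrongly flagged, which is why FPR is usually reported alongside the detection rate.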
“…The behavioral drift is defined in terms of the development of new attack variants with changing behavior over time [15,16]. Behavioral drift is the basis for zero-day attacks [17], by changing the attack attributes (which are also called features) [6,14,18]. For developing variants, changing distribution of features changes the significance of features accordingly.…”
Section: Introduction (mentioning)
confidence: 99%