2022
DOI: 10.48550/arxiv.2203.12980
Preprint

MERLIN -- Malware Evasion with Reinforcement LearnINg

Abstract: In addition to signature-based and heuristics-based detection techniques, machine learning (ML) is widely used to generalize to new, never-before-seen malicious software (malware). However, it has been demonstrated that ML models can be fooled by tricking the classifier into returning the incorrect label. These studies, for instance, usually rely on a prediction score that is fragile to gradient-based attacks. In the context of a more realistic situation where an attacker has very little information about the …

citations
Cited by 3 publications
(5 citation statements)
references
References 14 publications
(22 reference statements)
“…In contrast with other approaches [22,23], we do not verify the functionality preservation of generated AEs, but we propose validating each modification individually before the generation process. Therefore, our approach is more time-efficient as it does not require discarding nonfunctional AEs during or at the end of the generation procedure.…”
Section: Validity of PE File Modifications
Citation type: mentioning
confidence: 92%
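The trade-off described in this statement (validate each modification once, up front, rather than checking every generated sample afterwards) can be sketched in a few lines. This is only an illustration under stated assumptions: the modification functions, the probe inputs, and the prefix-preservation check are hypothetical stand-ins invented here, not the citing paper's actual validation procedure, which would have to test real functionality.

```python
from typing import Callable, Dict, List

# A modification maps a byte string to a new byte string.
Modification = Callable[[bytes], bytes]


def append_padding(data: bytes) -> bytes:
    """Hypothetical modification: append zero bytes after the original content."""
    return data + b"\x00" * 16


def truncate_tail(data: bytes) -> bytes:
    """Hypothetical modification that damages the original content."""
    return data[:-4]


def preserves_content(mod: Modification, probes: List[bytes]) -> bool:
    """Toy validity check (an assumption): the original bytes must survive as a prefix."""
    return all(mod(p).startswith(p) for p in probes)


def validate_once(mods: Dict[str, Modification], probes: List[bytes]) -> Dict[str, Modification]:
    """Validate every modification a single time, before any generation happens."""
    return {name: m for name, m in mods.items() if preserves_content(m, probes)}


def generate(data: bytes, plan: List[str], valid: Dict[str, Modification]) -> bytes:
    """Apply a sequence of pre-validated modifications; no per-sample check is needed."""
    for name in plan:
        data = valid[name](data)
    return data


PROBES = [b"probe-one", b"probe-two-longer"]
CANDIDATES = {"append_padding": append_padding, "truncate_tail": truncate_tail}
VALID = validate_once(CANDIDATES, PROBES)
```

Because the check runs once per modification instead of once per generated sample, its cost does not grow with the number of samples produced, which is the time saving the statement refers to.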
“…Quertier et al. in [23] used reinforcement learning algorithms to attack MalConv, GBDT by EMBER and Grayscale (convolutional neural network interpreting PE binaries as images) classifiers in grey-scale settings with available prediction scores for learning. Further, the authors targeted commercial AV in a pure black-box environment as well.…”
Section: Reinforcement Learning-based Attacks
Citation type: mentioning
confidence: 99%
“…Their code transformation process first defines a set of actions performed on the Windows PE header, such as inserting overlay bytes, packing and unpacking. MERLIN [14] and Pesidious [15] used action techniques optimized by reinforcement learning algorithms to write agents that learn to manipulate PE files based on a reward provided by taking specific manipulation actions. Their code manipulation process first defines a set of actions performed on the Windows portable executable (PE) header, such as inserting overlay bytes, packing and unpacking, etc.…”
Section: Code Transformation Actions
Citation type: mentioning
confidence: 99%
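The idea of an agent that learns which manipulation action to take from a reward signal can be illustrated with a deliberately simplified sketch. Everything below is an assumption made for illustration: it is an epsilon-greedy bandit rather than the algorithms used by MERLIN or Pesidious, the action names are only labels borrowed from the statement above, no file is touched, and the reward comes from invented success probabilities instead of a real classifier.

```python
import random
from typing import Dict, List

# Action labels taken from the citation statement; they are names only here.
ACTIONS: List[str] = ["insert_overlay_bytes", "pack", "unpack"]

# Invented success rates standing in for "the classifier changed its label".
SIMULATED_SUCCESS: Dict[str, float] = {
    "insert_overlay_bytes": 0.2,
    "pack": 0.6,
    "unpack": 0.1,
}


def simulated_reward(action: str, rng: random.Random) -> float:
    """Return 1.0 on a simulated success and 0.0 otherwise (no real detector involved)."""
    return 1.0 if rng.random() < SIMULATED_SUCCESS[action] else 0.0


def train(episodes: int = 2000, epsilon: float = 0.1, seed: int = 0) -> Dict[str, float]:
    """Estimate the average reward of each action with an epsilon-greedy policy."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}
    count = {a: 0 for a in ACTIONS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore a random action
        else:
            action = max(ACTIONS, key=lambda a: value[a])  # exploit the best estimate
        reward = simulated_reward(action, rng)
        count[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        value[action] += (reward - value[action]) / count[action]
    return value
```

Calling `train()` returns one estimated reward per action. A full reinforcement learning setup would additionally condition the choice on a state (for example, features of the current file), which this sketch omits.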
“…This characteristic makes it almost impracticable to discover an approximate or exact differentiable function over the payload feature space [8–13]. Initial observations from the literature [5,8,9,12–16] point out that code transformation actions, such as appending semantic NOP instructions, inserting jump instructions and replacing existing instructions, when applied to software or an executable file, can obfuscate the file against pirating or lower the file's true positive rate. In this work, we enhanced these code transformation actions with a Dynamic Programming based search method, a reinforcement learning algorithm, to increase their evasive potency against static malware scanners while satisfying the behavior-preserving criteria.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…However, there is still very little work that directly applies reinforcement learning methods to malware detection. Although there are some works [7], the solution in this article can be considered in order to further improve detection efficiency and intelligence. Although the internal program structure of the malware itself differs, its malicious behavior must eventually be realized as actual dynamic behavior.…”
Section: Introduction
Citation type: mentioning
confidence: 99%