2023
DOI: 10.1109/tse.2022.3187689

Revisiting Binary Code Similarity Analysis Using Interpretable Feature Engineering and Lessons Learned

Abstract: Binary code similarity analysis (BCSA) is widely used for diverse security applications, including plagiarism detection, software license violation detection, and vulnerability discovery. Despite the surging research interest in BCSA, it is significantly challenging to perform new research in this field for several reasons. First, most existing approaches focus only on the end results, namely, increasing the success rate of BCSA, by adopting uninterpretable machine learning. Moreover, they utilize their own be…

Citations: cited by 22 publications (13 citation statements)
References: 139 publications (372 reference statements)
“…A review of related approaches has shown that binary similarity has profited from machine learning and that usage of machine learning is viable for deobfuscation tasks. Out of recent state-of-the-art function clone detection approaches, only some approaches deal with obfuscation [47], [13], [30], [34]. None of these approaches have tried to apply binary code similarity algorithms to counter obfuscation by virtualization.…”
Section: Obfuscated Function Clone Identification (mentioning)
confidence: 99%
“…Research dealing with virtualization in the context of function clone detection is scarce. A recent survey [34] highlights the need for covering interprocedural virtualization obfuscators like Themida [57], or VMProtect [61], as the obfuscations applied by Hikari [64] do not appear to be more complex than cross-optimization binary similarity. OFCI does not solve interprocedural obfuscation either, but is intended as a stepping stone to expand research in this area.…”
Section: E Virtual Machine Analysis (mentioning)
confidence: 99%
“…Second, the function signature of a program is an important feature for representing the function semantics and able to improve the performance of binary code similarity detection [22], however, previous works didn't design methods to extract the information in function signatures.…”
Section: Introduction (mentioning)
confidence: 99%
“…Li et al [10] grab intermediate information from the compiler to generate their disassembly ground truth, in an attempt to validate ground truth. Another useful paper is by Kim et al [8] that discusses a benchmark for binary code similarity analysis.…”
Section: Introduction (mentioning)
confidence: 99%