Abstract. Alongside the massive expansion of smartphones, tablets, and other smart devices, the number of malware threats targeting these platforms is growing. Software security companies are not prepared for such a diversity of target platforms, and only a few techniques exist for platform-independent malware analysis. This is a major security issue today. In this paper, we propose the concept of a retargetable reverse compiler (i.e., a decompiler), which is in an early stage of development. The retargetable decompiler transforms platform-specific binary applications into a high-level language (HLL) representation, which can be further analyzed in a uniform way. This tool will help with static, platform-independent malware analysis. Our unique solution is based on the use of two systems that were not originally intended for such an application: the architecture description language (ADL) ISAC for platform description and the LLVM Compiler System as the core of the decompiler. In this study, we show that our tool can produce highly readable HLL code.
Detection of statically linked code is one of the important steps in the decompilation process. It delimits the code that the decompiler can skip. Type annotations provide additional information about the number, types, and suitable names of the arguments and return values of the functions recognized in the statically linked code. This information is important for generating calls to these functions. The detection is based on generic signatures, which are created from static libraries. The signatures are composed of the first bytes of library modules, CRC codes, module sizes, public symbols, and, optionally, tail bits or references. Organizing the signatures in a tree structure improves performance by decreasing the number of compared bytes. A generic approach to detection is achieved by using a common object file format, so the process is not restricted to a specific architecture or file format. However, there are situations in which a conflict in the detection can be resolved only by analysis in the decompiler. The impact of signature usage is verified by tests with the decompiler.
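The signature matching described above can be illustrated with a minimal sketch. It is not the tool's actual signature format or algorithm: the names `Signature`, `Node`, `make_signature`, `insert`, and `match`, the four-byte prefix length, the use of CRC-32 from Python's `zlib`, and the `None` wildcard for relocated bytes are all assumptions made for illustration. Tail bits and references are left out. The sketch stores the first bytes of each module in a tree, so that modules sharing a prefix are compared once, and confirms a candidate by module size and CRC.

```python
import zlib
from dataclasses import dataclass, field

PREFIX_LEN = 4  # assumed length; chosen small to keep the example readable


@dataclass(frozen=True)
class Signature:
    name: str      # public symbol of the library module
    prefix: tuple  # first bytes; None marks a byte that may vary (e.g. relocated)
    crc: int       # CRC-32 of the bytes that follow the prefix
    size: int      # total module size in bytes


@dataclass
class Node:
    children: dict = field(default_factory=dict)    # byte value (or None) -> Node
    signatures: list = field(default_factory=list)  # signatures ending at this node


def make_signature(name, code, wildcard_offsets=()):
    """Build a signature from the raw bytes of one library module."""
    prefix = tuple(
        None if i in wildcard_offsets else b
        for i, b in enumerate(code[:PREFIX_LEN])
    )
    return Signature(name, prefix, zlib.crc32(code[PREFIX_LEN:]), len(code))


def insert(root, sig):
    """Add a signature to the tree, sharing nodes for common first bytes."""
    node = root
    for key in sig.prefix:
        node = node.children.setdefault(key, Node())
    node.signatures.append(sig)


def match(root, image, offset):
    """Return names of all signatures that match `image` at `offset`."""
    found = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        # Signatures stored here have a matching prefix; confirm size and CRC.
        for sig in node.signatures:
            end = offset + sig.size
            if end <= len(image) and zlib.crc32(image[offset + depth:end]) == sig.crc:
                found.append(sig.name)
        pos = offset + depth
        if pos < len(image):
            # Follow the exact byte and, if present, the wildcard branch.
            for key in (image[pos], None):
                child = node.children.get(key)
                if child is not None:
                    stack.append((child, depth + 1))
    return found
```

In this sketch, two modules with the same first bytes end at the same tree node and are told apart only by size and CRC. If `match` returns more than one name, the sketch does not choose between them, which corresponds to the conflicts that, as noted above, can be resolved only by analysis in the decompiler.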