Abstract.A software birthmark is a unique characteristic of a program that can be used as a software theft detection. In this paper we suggest and empirically evaluate a static birthmark of binary executables based on API call structure. The program properties employed in this birthmark are functions and standard API calls when the functions are executed. The API calls from a function includes the API calls explicitly found from the function and its descendants within limited depth in the call graph. To statically identify functions, call graphs and API calls, we utilizes IDAPro disassembler and its plug-ins. We define the similarity between two functions as the proportion of the number of all API calls to the number of the common API calls. The similarity between two programs is obtained by the maximum weight bipartite matching between two programs using the function similarity matrix. To show the credibility of the proposed techniques, we compare the same applications with different versions and the various types of applications which include text editors, picture viewers, multimedia players, P2P applications and ftp clients. To show the resilience, we compare binary executables compiled from various compilers. The empirical result shows that the similarities obtained using our birthmark sufficiently indicates the functional and structural similarities among programs.
SUMMARY A software birthmark means the inherent characteristics of a program that can be used to identify the program. A comparison of such birthmarks facilitates the detection of software theft. In this paper, we propose a static Java birthmark based on a set of stack patterns, which reflect the characteristic of Java applications. A stack pattern denotes a sequence of bytecodes that share their operands through the operand stack. A weight scheme is used to balance the influence of each bytecode in a comparison of the birthmarks. We evaluate the proposed birthmark with respect to two properties required for a birthmark: credibility and resilience. The empirical results show that the proposed birthmark is highly credible and resilient to program transformation. We also compare the proposed birthmark with existing birthmarks, such as that of Tamada et al. and the k-gram birthmark. The experimental results show that the proposed birthmark is more stable than the birthmarks in terms of resilience to program transformation. Thus, the proposed birthmark can provide more reliable evidence of software theft when the software is modified by someone other than author.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.