Yonghwi Kwon scite author profile

In this paper, we develop a model based causality inference technique for audit logging that does not require any application instrumentation or kernel modification. It leverages a recent dynamic analysis, dual execution (LDX), that can infer precise causality between system calls but unfortunately requires doubling the resource consumption such as CPU time and memory consumption. For each application, we use LDX to acquire precise causal models for a set of primitive operations. Each model is a sequence of system calls that have interdependences , some of them caused by memory operations and hence implicit at the system call level. These models are described by a language that supports various complexity such as regular, context-free, and even context-sensitive. In production run, a novel parser is deployed to parse audit logs (without any enhancement) to model instances and hence derive causality. Our evaluation on a set of real-world programs shows that the technique is highly effective. The generated models can recover causality with 0% false-positives (FP) and false-negatives (FN) for most programs and only 8.3% FP and 5.2% FN in the worst cases. The models also feature excellent composibility, meaning that the models derived from primitive operations can be composed together to describe causality for large and complex real world missions. Applying our technique to attack investigation shows that the system-wide attack causal graphs are highly precise and concise, having better quality than the state-of-the-art.

show abstract

BDA: practical dependence analysis for binary executables by unbiased whole-program path sampling and per-path abstract interpretation

Zhang

You

Tao

et al. 2019

Proc. ACM Program. Lang.

View full text Add to dashboard Cite

Binary program dependence analysis determines dependence between instructions and hence is important for many applications that have to deal with executables without any symbol information. A key challenge is to identify if multiple memory read/write instructions access the same memory location. The state-of-the-art solution is the value set analysis (VSA) that uses abstract interpretation to determine the set of addresses that are possibly accessed by memory instructions. However, VSA is conservative and hence leads to a large number of bogus dependences and then substantial false positives in downstream analyses such as malware behavior analysis. Furthermore, existing public VSA implementations have difficulty scaling to complex binaries. In this paper, we propose a new binary dependence analysis called BDA enabled by a randomized abstract interpretation technique. It features a novel whole program path sampling algorithm that is not biased by path length, and a per-path abstract interpretation avoiding precision loss caused by merging paths in traditional analyses. It also provides probabilistic guarantees. Our evaluation on SPECINT2000 programs shows that it can handle complex binaries such as gcc whereas VSA implementations from the-state-of-art platforms have difficulty producing results for many SPEC binaries. In addition, the dependences reported by BDA are 75 and 6 times smaller than Alto, a scalable binary dependence analysis tool, and VSA, respectively, with only 0.19% of true dependences observed during dynamic execution missed (by BDA). Applying BDA to call graph generation and malware analysis shows that BDA substantially supersedes the commercial tool IDA in recovering indirect call targets and outperforms a state-of-the-art malware analysis tool Cuckoo by disclosing 3 times more hidden payloads. CCS Concepts: • Security and privacy → Software reverse engineering; • Software and its engineering → Interpreters; Automated static analysis.

show abstract

Probabilistic Disassembly

Miller

Kwon

Sun

et al. 2019

View full text Add to dashboard Cite

An Empirical Study of Bugs in WebAssembly Compilers

Romano

Kwon

Wang

2021

View full text Add to dashboard Cite

Dual Execution for On the Fly Fine Grained Execution Comparison

Kim

Kwon

Sumner

et al. 2015

SIGARCH Comput. Archit. News

View full text Add to dashboard Cite

Execution comparison has many applications in debugging, malware analysis, software feature identification, and intrusion detection. Existing comparison techniques have various limitations. Some can only compare at the system event level and require executions to take the same input. Some require storing instruction traces that are very space-consuming and have difficulty dealing with non-determinism. In this paper, we propose a novel dual execution technique that allows on-the-fly comparison at the instruction level. Only differences between the executions are recorded. It allows executions to proceed in a coupled mode such that they share the same input sequence with the same timing, reducing nondeterminism. It also allows them to proceed in a decoupled mode such that the user can interact with each one differently. Decoupled executions can be recoupled to share the same future inputs and facilitate further comparison. We have implemented a prototype and applied it to identifying functional components for reuse, comparative debugging with new GDB primitives, and understanding real world regression failures. Our results show that dual execution is a critical enabling technique for execution comparison.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Yonghwi Kwon

MCI : Modeling-based Causality Inference in Audit Logging for Attack Investigation

BDA: practical dependence analysis for binary executables by unbiased whole-program path sampling and per-path abstract interpretation

Probabilistic Disassembly

An Empirical Study of Bugs in WebAssembly Compilers

Dual Execution for On the Fly Fine Grained Execution Comparison

Contact Info

Product

Resources

About