In the past decade, a new class of cyber-threats, known as "Advanced Persistent Threat" (APT), has emerged and has been used by different organizations to perform dangerous and effective attacks against financial and political entities, critical infrastructures, and so on. To identify APT-related malware early, a semi-automatic approach to malware sample analysis is needed. Recently, a malware triage step for a semi-automatic malware analysis architecture has been introduced. This step identifies incoming APT samples early, among all the malware delivered per day in cyberspace, so that they can be dispatched immediately to deeper analysis. In that article, the authors built the knowledge base on known APTs obtained from publicly available reports. For efficiency reasons, they rely on static malware features, extracted with negligible delay, and use machine learning techniques for the identification. Unfortunately, the proposed solution requires a long training time and must be completely retrained each time new APT samples, or even a new APT class, are discovered. In this article, we move from multi-class classification to a group of one-class classifiers, which significantly decreases runtime and allows higher modularity, while still guaranteeing precision and accuracy over 90%. CCS Concepts: • Social and professional topics → Malware/spyware crime;
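The modularity argument above can be illustrated with a minimal sketch. It assumes a toy one-class model (a centroid plus a distance radius) standing in for whatever one-class classifier the authors actually use, which the abstract does not specify. The names `CentroidOneClass`, `AptTriage`, `add_class`, and the 0.95 quantile are hypothetical and chosen only for illustration.

```python
import math


class CentroidOneClass:
    """Toy one-class model (an assumption, not the authors' classifier):
    a sample is accepted if it lies within a radius of the class centroid."""

    def fit(self, samples, quantile=0.95):
        dim = len(samples[0])
        self.centroid = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
        dists = sorted(math.dist(s, self.centroid) for s in samples)
        # Radius is the training distance at the given quantile.
        idx = min(len(dists) - 1, int(quantile * len(dists)))
        self.radius = dists[idx]
        return self

    def score(self, sample):
        # A ratio of at most 1.0 means the sample falls inside the class region.
        return math.dist(sample, self.centroid) / (self.radius or 1e-9)


class AptTriage:
    """One independent one-class model per known APT class."""

    def __init__(self):
        self.models = {}

    def add_class(self, name, samples):
        # Only the new class is trained; existing models are left untouched.
        self.models[name] = CentroidOneClass().fit(samples)

    def classify(self, sample):
        """Return the best-matching APT class, or None if no model accepts the sample."""
        if not self.models:
            return None
        name, model = min(self.models.items(), key=lambda kv: kv[1].score(sample))
        return name if model.score(sample) <= 1.0 else None


triage = AptTriage()
triage.add_class("apt_a", [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1)])
triage.add_class("apt_b", [(5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.0, 5.1)])
print(triage.classify((0.1, 0.1)))    # close to the apt_a training samples
print(triage.classify((50.0, 50.0)))  # far from every known class
```

In this sketch, registering a newly discovered APT class costs one call to `add_class`, whereas a multi-class model would have to be refit on all classes. A sample rejected by every model is simply not flagged as a known APT.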
Understanding the behavior of malware requires a semi-automatic approach that combines complex software tools with human analysts in the loop. However, the huge number of malicious samples developed daily calls for a prioritization mechanism that carefully selects the samples that really deserve further examination by analysts. This prevents computational resources from being overloaded and human analysts from being saturated. In this paper we introduce a malware triage stage where samples are quickly and automatically examined to promptly decide whether they should be immediately dispatched to human analysts or to other specific automatic analysis queues, rather than following the common and slow analysis pipeline. This triage stage is encapsulated in an architecture for semi-automatic malware analysis presented in a previous work. We propose an approach for sample prioritization and its realization within that architecture. Our analysis focuses on malware developed by Advanced Persistent Threats (APTs). We build the knowledge base used in the triage on known APTs obtained from publicly available reports. To make the triage as fast as possible, we consider only static malware features, which can be extracted with negligible delay and without executing the malware samples, and we use them to train a random forest classifier. The classifier has been tuned to maximize its precision, so that analysts and other components of the architecture are most likely to receive only malware correctly identified as similar to known APTs, and do not waste important resources on false positives. A preliminary analysis shows high precision and accuracy, as desired.
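One common way to favor precision over recall is to raise the decision threshold on the classifier's score until a target precision is met on held-out data. The sketch below shows that idea only; the abstract does not say how the random forest was actually tuned. The score is assumed to be something like the fraction of trees voting "APT", and `precision_at`, `pick_threshold`, and the sample data are hypothetical.

```python
def precision_at(threshold, scores, labels):
    """Precision of the rule `score >= threshold`; None if nothing is flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fp) if tp + fp else None


def pick_threshold(scores, labels, target_precision=0.95):
    """Return the lowest threshold whose validation precision meets the target.

    Picking the lowest qualifying threshold flags as many samples as possible
    while respecting the precision target. Returns None if no threshold qualifies.
    """
    for t in sorted(set(scores)):
        p = precision_at(t, scores, labels)
        if p is not None and p >= target_precision:
            return t
    return None


# Hypothetical validation scores (e.g. fraction of trees voting "APT")
# and ground-truth labels (1 = APT, 0 = not APT).
val_scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]
val_labels = [0, 0, 1, 1, 1, 0]
print(pick_threshold(val_scores, val_labels, target_precision=0.95))
```

A higher threshold sends fewer samples to analysts, so the trade-off is that some true APT samples may stay in the ordinary analysis pipeline.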
Critical Infrastructures (CIs) are among the main targets of activists, cyber terrorists, and state-sponsored attacks. To protect itself, a CI needs to build and keep updated a domestic knowledge base of cyber threats. It cannot rely completely on external service providers, because information on incidents can be so sensitive as to impact national security. In this paper, we propose an architecture for a malware analysis framework to support CIs in such a challenging task. Given the huge number of new malware samples produced daily, the architecture is designed to automate the analysis to a large extent, leaving to human analysts only a small and manageable part of the whole effort. This non-automatic part of the analysis requires a wide range of expertise, usually contributed by several analysts. The architecture enables analysts to work collaboratively to improve the understanding of samples that demand deeper investigation (intra-CI collaboration). Furthermore, the architecture allows partial and configurable views of the knowledge base to be shared with other interested CIs, in order to collectively obtain a more complete view of the cyber threat landscape (inter-CI collaboration).