Dainius Čeponis scite author profile

Classical signature-based attack detection methods have stagnated and cannot counter zero-day and similar attacks, while anomaly-based detection methods are still characterized by large numbers of false positives. The progress achieved in recent years in deep learning techniques provides potential for renewing investigations into anomaly-based intrusion detection system training. While network-based intrusion detection systems have datasets for training, researchers working on host-based intrusion detection systems lack this component. Most datasets are created for Linux OS; the latest Windows OS dataset was introduced in 2013 and included only a minimal collection of system call features. In this article we propose a method for automated system-level anomaly dataset generation, intended for further training of artificial intelligence-based host-based intrusion detection systems, and present our generated exhaustive collection of Windows OS malware-based system calls, which also includes additional information on malware activity. The main characteristics of the dataset are presented.

Evaluation of Deep Learning Methods Efficiency for Malicious and Benign System Calls Classification on the AWSCTD

Security and Communication Networks

Goranin

2019

The increasing amount of malware and cyberattacks on a host level increases the need for a reliable anomaly-based host IDS (HIDS) that would be able to deal with zero-day attacks and would ensure a low false alarm rate (FAR), which is critical for the detection of such activity. Deep learning methods such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are considered to be highly suitable for data-driven security solutions. Therefore, it is necessary to perform a comparative analysis of such methods in order to evaluate their efficiency in attack classification as well as their ability to distinguish malicious and benign activity. In this article, we present the results achieved with the AWSCTD (attack-caused Windows OS system calls traces dataset), which can be considered the most exhaustive set of host-level anomalies at the moment, including 112.56 million system calls from 12110 executable malware samples and 3145 benign software samples with 16.3 million system calls. The best results were obtained with CNNs, with up to 90.0% accuracy for family classification and 95.0% accuracy for malicious/benign determination. RNNs demonstrated slightly inferior results. Furthermore, CNN tuning via an increase in the number of layers should make them practically applicable for host-level anomaly detection.
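The two measures named in this abstract, accuracy and false alarm rate (FAR), can be illustrated with a minimal sketch. The function name `binary_metrics` and the label convention (1 = malicious, 0 = benign) are assumptions made for illustration only; this is not code from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy and false alarm rate for a binary detector.

    Assumed label convention (not from the paper): 1 = malicious, 0 = benign.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # malicious sample correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # benign sample flagged: a false alarm
        elif truth == 0 and pred == 0:
            tn += 1  # benign sample correctly passed
        else:
            fn += 1  # malicious sample missed

    total = tp + fp + tn + fn
    benign = fp + tn
    return {
        # share of all samples that were classified correctly
        "accuracy": (tp + tn) / total if total else 0.0,
        # share of benign samples that raised an alarm
        "far": fp / benign if benign else 0.0,
    }


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0, 1]
    predictions = [1, 0, 0, 1, 0, 1]
    print(binary_metrics(labels, predictions))
```

FAR is computed over benign samples only, so a detector can score high accuracy on a malware-heavy dataset and still raise alarms on a large share of benign software.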

Investigation of Dual-Flow Deep Learning Models LSTM-FCN and GRU-FCN Efficiency against Single-Flow CNN Models for the Host-Based Intrusion and Malware Detection Task on Univariate Times Series Data

Goranin

2020

Applied Sciences

Intrusion and malware detection tasks on a host level are a critical part of the overall information security infrastructure of a modern enterprise. While classical host-based intrusion detection systems (HIDS) and antivirus (AV) approaches are based on change monitoring of critical files and malware signatures, respectively, some recent research, utilizing relatively vanilla deep learning (DL) methods, has demonstrated promising anomaly-based detection results that already have practical applicability due to a low false positive rate (FPR). More complex DL methods typically provide better results in natural language processing and image recognition tasks. In this paper, we analyze the applicability of more complex dual-flow DL methods, such as long short-term memory fully convolutional network (LSTM-FCN), gated recurrent unit (GRU)-FCN, and several others, for the specified task on the attack-caused Windows OS system calls traces dataset (AWSCTD) and compare them with vanilla single-flow convolutional neural network (CNN) models. The results obtained do not demonstrate any advantages of dual-flow models when processing univariate time series data; these models introduce an unnecessary level of complexity and increase training and anomaly detection time, which is crucial in the intrusion containment process. On the other hand, the newly tested AWSCTD-CNN-static (S) single-flow model demonstrated three times better training and testing times while preserving high detection accuracy.
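To illustrate what "univariate time series data" can mean for system call traces, the sketch below maps each call name to an integer and pads or truncates the trace to a fixed length. The function names, the reserved ids (0 for padding, 1 for unknown calls), and the sample call names are assumptions for illustration; the actual AWSCTD preprocessing may differ.

```python
PAD_ID = 0      # assumed id for padding positions
UNKNOWN_ID = 1  # assumed id for calls not seen when building the vocabulary


def build_vocabulary(traces):
    """Assign a stable integer id to every distinct system call name."""
    vocab = {}
    for trace in traces:
        for call in trace:
            if call not in vocab:
                vocab[call] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode_trace(trace, vocab, length):
    """Turn one trace into a fixed-length list of integer ids."""
    ids = [vocab.get(call, UNKNOWN_ID) for call in trace[:length]]
    ids.extend([PAD_ID] * (length - len(ids)))  # right-pad short traces
    return ids


if __name__ == "__main__":
    # Illustrative traces only; not taken from the dataset.
    traces = [
        ["NtOpenFile", "NtReadFile", "NtClose"],
        ["NtOpenFile", "NtWriteFile"],
    ]
    vocab = build_vocabulary(traces)
    for trace in traces:
        print(encode_trace(trace, vocab, 4))
```

Each encoded trace is a single sequence of integers, one value per time step, which is the input shape a single-flow 1D CNN or a recurrent model would consume after an embedding or one-hot step.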

Windows Api Hooking Libraries Research / Windows Api Funkcijų Sekų Perėmimo Bibliotekų Tyrimas

Radvilavicius

2011

The paper describes how to apply Windows API hooking using third-party libraries and solutions. The research used the Windows API function SetWindowsHookEx and the Detours and EasyHook libraries. The libraries' methods, features, and advantages are discussed. The practical part contains tests of the libraries. In the analysis, we tested target program start with the hooking library and the injected function call.

Automated System-Level Anomaly Detection and Classification Using Modified Random Forest

Gyamfi

Goranin

2022

Method for Attack Tree Data Transformation and Import Into IT Risk Analysis Expert Systems

et al. 2020

Information technology (IT) security risk analysis preventatively helps organizations identify their vulnerable systems or internal controls. Some researchers propose expert systems (ES) as the solution for risk analysis automation, since risk analysis by human experts is expensive and time-consuming. By design, ES need a knowledge base, which must be up to date and of high quality. Manual creation of databases is also expensive and cannot ensure stable information renewal. These facts make the knowledge base automation process very important. This paper proposes a novel method of converting attack trees to a format usable by expert systems, so that existing attack tree repositories can be used to facilitate information and IT security risk analysis. The method translates attack trees into the Java Expert System Shell (JESS) format in two consecutive steps: first, ATTop, a software bridging tool that enables automated analysis of attack trees using a model-driven engineering approach, translates attack trees into the eXtensible Markup Language (XML) format; then the newly developed ATES (attack trees to expert system) program converts the XML into a JESS-compatible format. The detailed method description, along with samples of attack tree conversion and results of conversion experiments on a significant number of attack trees, are presented and discussed. The results demonstrate the method's high reliability rate and the viability of attack trees as a source for the knowledge bases of expert systems used in the IT security risk analysis process.
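The XML-to-JESS step described above can be pictured with a minimal sketch. Everything specific here is hypothetical: the `<node name="..." gate="...">` XML layout, the `attack-node` fact template, and the function name `tree_to_facts` are invented for illustration and do not reproduce the actual output of ATTop or the behavior of ATES.

```python
import xml.etree.ElementTree as ET


def tree_to_facts(xml_text):
    """Walk a nested attack tree and emit one JESS-style assert per node.

    Assumes a hypothetical XML layout: nested <node> elements with a
    'name' attribute and an optional 'gate' attribute (AND / OR).
    """
    root = ET.fromstring(xml_text)
    facts = []

    def visit(node, parent_name):
        name = node.get("name", "")
        gate = node.get("gate", "LEAF")  # nodes without a gate are leaves
        # Note: names containing double quotes are not escaped in this sketch.
        facts.append(
            f'(assert (attack-node (name "{name}") '
            f'(gate {gate}) (parent "{parent_name}")))'
        )
        for child in node.findall("node"):
            visit(child, name)

    visit(root, "")
    return facts


if __name__ == "__main__":
    sample = (
        '<node name="Steal data" gate="OR">'
        '<node name="Phish admin"/>'
        '<node name="Exploit server" gate="AND">'
        '<node name="Find vulnerability"/>'
        "</node>"
        "</node>"
    )
    for fact in tree_to_facts(sample):
        print(fact)
```

Recording the parent name in each fact flattens the tree into independent assertions, which is one plausible way for a rule engine to reason over parent-child relations without nested structures.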

Research of machine and deep learning methods application for host-level intrusion detection and classification

Čeponis

2021

Malware Detection Using Convolutional Neural Network, A Deep Learning Framework: Comparative Analysis

Gyamfi

Goranin

Čeponis

et al. 2022

JISIS

Malware detection is an essential task in securing workstations, mobile devices, servers, etc. It is mainly used to identify malware that causes malicious problems. Traditional detection systems have a much lower detection rate and a higher chance of error. As technology evolves day by day, Deep Learning (DL) is increasingly used in this detection field. This paper therefore presents an effective DL-based malware detection approach with the following stages: a) data collection from the Malimg dataset; b) pre-processing to eliminate unwanted noise from the dataset; c) feature extraction, where Principal Component Analysis (PCA) is used to extract the required features; d) feature selection, where Particle Swarm Optimization (PSO) is used for dimensionality reduction; and e) classification, where a Convolutional Neural Network (CNN) is used as the classifier. The models are evaluated using measures such as accuracy, sensitivity, specificity, precision, recall, F1-score, TPR, FPR, and detection rate against models such as VGG16, VGG19, DenseNet, AlexNet, and ensemble learning. The proposed system (D-WARE) gives much higher performance, with 96% accuracy.