2022
DOI: 10.1016/j.cose.2022.102846

An ensemble of pre-trained transformer models for imbalanced multiclass malware classification

Cited by 21 publications (6 citation statements)
References 29 publications
“…From Table 4, as can be observed, the first five methods give a baseline for classification tasks, and models based on malware images or traditional NLP methods can achieve accuracy around 0.90. The model of [21] achieved an accuracy of 0.90, the model of [48] achieved an accuracy of 0.93, and our model had an accuracy of 0.96. From a baseline perspective, all three models go beyond other basic or advanced methods and achieve a better result, demonstrating the effectiveness of the API calls and Encoder/Transformer architecture models.…”
Section: Comparison With Previous Methods
confidence: 69%
“…The first five methods are classic methods [14, 44–47] to do the malware family classification, and we report the results from their papers. The following five methods [16, 20, 21, 23, 48] are the latest effective work on the classification based on API calls, so we reproduce the methods and offer a convincing comparison result. The [21] method adopts a two-way feature extraction architecture for API calls, but the core module is a multi-layer CNN, and the correlation analysis is performed through Bi-LSTM.…”
Section: Comparison With Previous Methods
confidence: 99%
“…We compiled and analyzed these results as reference data, and the comparative outcomes are illustrated in Table 4. From Table 4, Demirkıran et al [43] tested multiple pre-trained models, including BERT, CA-NINE-S, and their proposed Random Transformer Forest (RTF) model, on the mal-api-2019 dataset. Li et al [44] utilized RNNs and transformer structures to classify malware by learning interactive features in API call sequences.…”
Section: Baselines Comparative Evaluation
confidence: 99%
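The Random Transformer Forest (RTF) cited above, like the ensemble in the paper this page indexes, combines the class-probability outputs of several fine-tuned transformer models. A minimal sketch of the combining step only, as a soft-voting average — the model names and probability values below are hypothetical, and real members would be fine-tuned transformers such as BERT or CANINE-S:

```python
# Hedged sketch: soft-voting over per-model class probabilities, the
# combining rule typically used by transformer ensembles. Probabilities
# here are illustrative placeholders, not real model outputs.

def soft_vote(prob_lists):
    """Average class-probability vectors from several models and
    return the index of the winning class."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Three hypothetical models scoring one API-call sequence over 4 malware families
preds = [
    [0.1, 0.6, 0.2, 0.1],   # model A
    [0.2, 0.5, 0.2, 0.1],   # model B
    [0.3, 0.3, 0.3, 0.1],   # model C
]
print(soft_vote(preds))  # prints 1: most averaged probability mass on family 1
```

Averaging probabilities (soft voting) rather than counting hard labels lets a confident minority model outweigh uncertain ones, which matters on imbalanced malware family distributions.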
“…Pre-trained transformer models, such as BERT, have become popular, offering fine-tuning for tasks like question answering, text classification, and language generation (Casola et al, 2022;Min et al, 2021;Qiu et al, 2020). These models have also found applications in cybersecurity, including cyberbullying detection, malware classification, and API call-based malware detection (Demirkiran et al, 2022;Oak et al, 2019;Paul & Saha, 2022).…”
Section: Salo Et Al Introduce a Hybrid Dimensionality Reduction Techn...
confidence: 99%