2016
DOI: 10.1007/s11416-016-0283-1

An investigation of byte n-gram features for malware classification

Cited by 120 publications (108 citation statements)
References: 29 publications
“…In such a case it is generally beneficial to perform feature selection on the n-grams extracted due to computational constraints and to reduce the impact from the curse of dimensionality [1,2]. This can be done by using information-gain or simply removing features that do not reach a minimum frequency [25,41,45]. Extracting the 328 bytes from our headers of interest significantly reduces the amount of data to process, increasing the flexibility of what we can experiment with when using n-grams.…”
Section: Appendix A N-gram Details (mentioning)
confidence: 99%
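
The minimum-frequency filtering mentioned in the statement above can be illustrated with a short sketch. This is not the cited authors' code: the function names, the choice of counting in how many samples an n-gram appears (rather than total occurrences), and the toy byte strings are all assumptions made for illustration.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int):
    """Yield every contiguous n-byte window of `data`."""
    for i in range(len(data) - n + 1):
        yield data[i:i + n]


def select_by_min_frequency(samples, n=2, min_count=2):
    """Keep n-grams that appear in at least `min_count` samples.

    Assumption: frequency is measured per sample (document frequency),
    so repeats inside one sample are counted only once.
    """
    doc_freq = Counter()
    for sample in samples:
        doc_freq.update(set(byte_ngrams(sample, n)))
    return {gram for gram, count in doc_freq.items() if count >= min_count}


# Toy inputs (hypothetical): two byte strings sharing a prefix, one unrelated.
samples = [b"MZ\x90\x00\x03", b"MZ\x90\x00\x04", b"\x7fELF\x02"]
vocab = select_by_min_frequency(samples, n=2, min_count=2)
```

Only the 2-grams shared by the first two samples survive the filter, which shows how a frequency threshold shrinks the feature space before any model is trained. An information-gain criterion would replace the count threshold with a score computed against class labels.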
“…Intuitively, we expect that as the training data becomes more generic, the models will become less accurate, and our results do indeed support this intuition. We believe that the results that we provide in this paper cast the work presented in (Raff et al, 2016) in a much different light, namely, that the inability to construct a strong model based on the extremely diverse and generic data follows immediately from the generality of the data itself, rather than being an inherent weakness of a particular feature, such as n-grams.…”
Section: Introduction (mentioning)
confidence: 84%
“…Previous research has shown that a variety of techniques based on byte n-grams can achieve relatively high accuracies for the detection problem (Liangboonprakong and Sornil, 2013; Reddy and Pujari, 2006;Shabtai et al, 2009;Tabish et al, 2009). However, a recent study based on n-gram analysis rejects this view and argues that n-grams promote a gross level of overfitting (Raff et al, 2016).…”
Section: Introduction (mentioning)
confidence: 99%
“…-We show least squares regression in the form of the ELM with a non-linear kernel can provide a template to fully enhance the feature space rather than implicit feature selection of the regressor used in [17]. The Malytics generalization performance for unseen data also shows the effectiveness of the applied regularization technique.…”
Section: Introduction (mentioning)
confidence: 95%