2019
DOI: 10.1016/j.comcom.2019.06.013

Unsupervised field segmentation of unknown protocol messages

Cited by 15 publications (4 citation statements)
References 12 publications
“…For example, ASAP [17] maps the message payloads to the vector space by constructing the marked letters derived from the separator and n-gram, and uses matrix factorization [34] to identify the basic direction and coordinate tuples to cluster different protocol messages. Sun et al [25] defines Token Format Distance (TFD) and Message Format Distance (MFD) by introducing basic rules of Augmented Backus Naur Form (ABNF) [35] to calculate protocol message distances, then uses the DBSCAN algorithm [36] to cluster protocol messages, and uses Silhouette Coefficient and Dunn Validity Index [37] to determine the best clustering parameters to improve the quality of clustering performance.…”
Section: Related Work | Citation type: mentioning
confidence: 99%
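
The clustering pipeline described in the statement above (compute a pairwise message distance, then group messages with DBSCAN) can be illustrated with a minimal sketch. It is not the method of ASAP [17] or of Sun et al [25]: as a stand-in for TFD/MFD it assumes a Jaccard distance over byte n-grams, and the names `ngrams`, `jaccard_distance`, and `dbscan`, the sample messages, and the values of `eps` and `min_pts` are all hypothetical. Parameter selection via the Silhouette Coefficient or Dunn Validity Index is left out.

```python
from collections import deque


def ngrams(msg: bytes, n: int = 2) -> set:
    """Return the set of byte n-grams occurring in a message."""
    return {msg[i:i + n] for i in range(len(msg) - n + 1)}


def jaccard_distance(a: bytes, b: bytes, n: int = 2) -> float:
    """Assumed stand-in distance: 1 - |A & B| / |A | B| over n-gram sets."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    union = ga | gb
    if not union:
        return 0.0  # two messages too short to have n-grams count as identical
    return 1.0 - len(ga & gb) / len(union)


def dbscan(messages, eps, min_pts, dist=jaccard_distance):
    """Plain DBSCAN; returns one cluster label per message, -1 means noise."""
    n = len(messages)
    # Precompute eps-neighborhoods (each point is its own neighbor).
    neighbors = [
        [j for j in range(n) if dist(messages[i], messages[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Hypothetical sample: three text-like messages, three binary ones, one outlier.
messages = [
    b"GET /index.html",
    b"GET /about.html",
    b"GET /index.htm",
    b"\x00\x01\x02\x03\x04\x05",
    b"\x00\x01\x02\x03\x04\x06",
    b"\x00\x01\x02\x03\x04\x07",
    b"zzzzzz",
]
labels = dbscan(messages, eps=0.7, min_pts=2)
```

With these assumed parameters the text-like and binary messages fall into separate clusters and the outlier is labeled as noise; a real protocol trace would need a format-aware distance and tuned parameters.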
“…Therefore, in this paper, we mainly divide a variety of unknown protocol messages into different clusters, which will facilitate future protocol reverse work. Effective protocol message clustering necessitates the resolution of two critical issues: the measurement of protocol message distance and the design of clustering algorithm [25]. It is worth noting that the message distance is the basis of protocol clustering.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…Netzob [3], Discoverer [5], and others [15] deduce fields as a by-product of sequence alignment with the already mentioned disadvantages. Existing statistical methods either require an already existing segmentation [3,5,15] or expect field boundaries at globally fixed positions [2,26,27], limiting the applicability to protocols specifically designed without variable length fields. If meta-data and common offsets of values in messages are available, the task is as simple as finding the corresponding or correlating values in the messages.…”
Section: Related Work | Citation type: mentioning
confidence: 99%
“…The general process consists of three phases: syntax inference, semantics inference, and state machine inference, which represents the order in which message types are transmitted. For semantics and state machine inference to be successful, syntax inference must be performed correctly, and accurate keyword extraction is crucial for accomplishing correct syntax inference [4–9]. In this paper, we use the term “keyword” to refer to a value that one field can have; accurate keyword extraction refers to the process of extracting values that exactly one field can have, not noise such as a combination of values from two or more fields or a portion of the value of one field.…”
Section: Introduction | Citation type: mentioning
confidence: 99%