Abstract: As the essential component responsible for communication, network services are security-critical, and it is vital to find vulnerabilities in them. Fuzzing is currently one of the most popular software vulnerability discovery techniques, widely adopted due to its high efficiency and low false positives. However, existing coverage-guided fuzzers mainly aim at stateless local applications, leaving stateful network services underexplored. Recently, some fuzzers targeting network services have been pro…
“…Our results for all text-based protocols in the PROFUZZBENCH protocol fuzzer benchmark [33] demonstrate the effectiveness of the LLM-guided approach: Compared to the baseline (AFLNET [36]) into which our approach was implemented, our tool CHATAFL covers almost 50% more state transitions, 30% more states, and 6% more code. CHATAFL shows similar improvements over the state-of-the-art (NSFUZZ [38]). In our ablation study, starting from the baseline we found that enabling (i) the grammar extraction, (ii) the seed enrichment, and (iii) the saturation handler one by one allows CHATAFL to achieve the same code coverage 2.0, 4.6, and 6.1 times faster, respectively, as the baseline achieves in 24 hours.…”
Section: Introduction (mentioning)
confidence: 68%
“…A mutation-based protocol fuzzer [36], [38] uses a set of pre-recorded message sequences as seed inputs for mutation. The recording ensures that the message structure and order are valid while mutational fuzzing will slightly corrupt both [36].…”
Section: A Protocol Fuzzing (mentioning)
confidence: 99%
“…The recording ensures that the message structure and order are valid while mutational fuzzing will slightly corrupt both [36]. In fact, all recently proposed protocol fuzzers, such as AFLNET [36] and NSFUZZ [38] follow this approach.…”
Section: A Protocol Fuzzing (mentioning)
confidence: 99%
“…LIVE555 implements RTSP in accordance with RFC 2326, functioning as a streaming server in entertainment and communications systems to manage streaming media servers. It is included in PROFUZZBENCH, a widely-used benchmark for stateful fuzzers of network protocols [36], [7], [38]. PROFUZZBENCH comprises a suite of representative open-source network servers for popular protocols, with LIVE555 being among them.…”
Section: Case Study: Testing the Capabilities of LLMs (mentioning)
confidence: 99%
“…Mutation-based protocol fuzzing reduces the dependence on a machine-readable specification of that required message structure or order by fuzzing recorded message sequences [36], [38], [7], [32]. The simple mutations often preserve the required protocol while still corrupting the message sequences enough to expose errors.…”
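The citation statements above describe mutation-based protocol fuzzing only in prose. As a rough illustration, the following minimal Python sketch mutates a recorded message sequence. It is not the AFLNET or NSFUZZ implementation: the seed messages, the four mutation operators, and the function names are all assumptions chosen for the example, and no messages are actually sent to a server.

```python
import random

# Hypothetical recorded seed: RTSP-like requests in a valid order.
# The URLs and header values are made up for illustration.
SEED_SEQUENCE = [
    b"OPTIONS rtsp://127.0.0.1:8554/stream RTSP/1.0\r\nCSeq: 1\r\n\r\n",
    b"DESCRIBE rtsp://127.0.0.1:8554/stream RTSP/1.0\r\nCSeq: 2\r\n\r\n",
    b"SETUP rtsp://127.0.0.1:8554/stream/track1 RTSP/1.0\r\nCSeq: 3\r\n\r\n",
    b"PLAY rtsp://127.0.0.1:8554/stream RTSP/1.0\r\nCSeq: 4\r\n\r\n",
]


def flip_byte(message: bytes, rng: random.Random) -> bytes:
    """Corrupt the message structure by overwriting one byte."""
    if not message:
        return message
    pos = rng.randrange(len(message))
    return message[:pos] + bytes([rng.randrange(256)]) + message[pos + 1:]


def mutate_sequence(sequence, rng):
    """Return a slightly corrupted copy of a non-empty message sequence.

    One operator is applied per call: either the structure of one message
    is corrupted (flip) or the message order is disturbed
    (duplicate, delete, swap).
    """
    mutated = list(sequence)  # never modify the recorded seed itself
    choice = rng.choice(["flip", "duplicate", "delete", "swap"])
    idx = rng.randrange(len(mutated))
    if choice == "flip":
        mutated[idx] = flip_byte(mutated[idx], rng)
    elif choice == "duplicate":
        mutated.insert(idx, mutated[idx])
    elif choice == "delete" and len(mutated) > 1:
        del mutated[idx]
    elif choice == "swap" and len(mutated) > 1:
        other = rng.randrange(len(mutated))
        mutated[idx], mutated[other] = mutated[other], mutated[idx]
    return mutated


if __name__ == "__main__":
    rng = random.Random(0)
    for message in mutate_sequence(SEED_SEQUENCE, rng):
        print(message)
```

Because each call changes at most one message or one position in the order, most of the recorded sequence stays valid, which is the property the quoted passages attribute to simple mutations.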
How to find security flaws in a protocol implementation without a machine-readable specification of the protocol? Facing the internet, protocol implementations are particularly security-critical software systems where inputs must adhere to a specific structure and order that is often informally specified in hundreds of pages in natural language (RFC). Without some machine-readable version of that protocol, it is difficult to automatically generate valid test inputs for its implementation that follow the required structure and order. It is possible to partially alleviate this challenge using mutational fuzzing on a set of recorded message sequences as seed inputs. However, the set of available seeds is often quite limited and will hardly cover the great diversity of protocol states and input structures. In this paper, we explore the opportunities of systematic interaction with pre-trained large language models (LLMs), which have ingested millions of pages of human-readable protocol specifications, to draw out machine-readable information about the protocol that can be used during protocol fuzzing. We use the knowledge of the LLMs about protocol message types for well-known protocols. We also checked the LLM's capability in detecting "states" for stateful protocol implementations by generating sequences of messages and predicting response codes. Based on these observations, we have developed an LLM-guided protocol implementation fuzzing engine. Our protocol fuzzer CHATAFL constructs grammars for each message type in a protocol, and then mutates messages or predicts the next messages in a message sequence via interactions with LLMs. Experiments on a wide range of real-world protocols from PROFUZZBENCH show significant efficacy in state and code coverage. Our LLM-guided stateful fuzzer was compared with state-of-the-art fuzzers AFLNET and NSFUZZ.
CHATAFL covers 47.60% and 42.69% more state transitions, 29.55% and 25.75% more states, and 5.81% and 6.74% more code, respectively. Apart from enhanced coverage, CHATAFL discovered nine distinct and previously unknown vulnerabilities in widely-used and extensively-tested protocol implementations while AFLNET and NSFUZZ only discovered three and four of them, respectively.
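The abstract states that CHATAFL constructs a grammar for each message type and then mutates messages. The sketch below illustrates one way grammar-constrained mutation could look in Python. It is an assumption-laden illustration, not CHATAFL's actual grammar format or mutation strategy: the `<<field>>` placeholder syntax, the `PLAY_GRAMMAR` template, the seed values, and the list of replacement values are all hypothetical, and the LLM interaction that would produce the grammar is omitted.

```python
import random
import re

# Hypothetical grammar for one message type: <<name>> marks a mutable field,
# everything else is fixed protocol structure. This format is an assumption.
PLAY_GRAMMAR = (
    "PLAY <<url>> RTSP/1.0\r\n"
    "CSeq: <<cseq>>\r\n"
    "Session: <<session>>\r\n\r\n"
)
FIELD = re.compile(r"<<(\w+)>>")

# Hypothetical field values taken from a recorded seed message.
SEED_VALUES = {
    "url": "rtsp://127.0.0.1:8554/stream",
    "cseq": "4",
    "session": "12345678",
}

# Illustrative replacement values (empty, oversized, negative, overflowing).
INTERESTING = ["", "A" * 64, "-1", "4294967296"]


def render(grammar: str, values: dict) -> str:
    """Fill every <<field>> placeholder with its value."""
    return FIELD.sub(lambda match: values[match.group(1)], grammar)


def mutate_fields(values: dict, rng: random.Random) -> dict:
    """Replace exactly one field value, leaving the fixed structure alone."""
    mutated = dict(values)
    name = rng.choice(sorted(mutated))
    mutated[name] = rng.choice(INTERESTING)
    return mutated


if __name__ == "__main__":
    rng = random.Random(1)
    print(render(PLAY_GRAMMAR, mutate_fields(SEED_VALUES, rng)))
```

Restricting mutation to the marked fields keeps the method name, header names, and line terminators intact, so a mutated message is more likely to be parsed by the server than a byte-level corruption would be.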