Traditional TCP/IP fingerprinting tools (e.g., nmap) are poorly suited for Internet-wide use due to the large amount of traffic and intrusive nature of the probes. This can be overcome by approaches that rely on a single SYN packet to elicit a vector of features from the remote server; however, these methods face difficult classification problems due to the high volatility of the features and severely limited amounts of information contained therein. Since these techniques have not been studied before, we first pioneer stochastic theory of single-packet OS fingerprinting, build a database of 116 OSes, design a classifier based on our models, evaluate its accuracy in simulations, and then perform OS classification of 37.8M hosts from an Internet-wide scan.
Machine-generated text presents a potential threat not only to the public sphere, but also to the scientific enterprise, whereby genuine research is undermined by convincing, synthetic text. In this paper we examine the problem of detecting GPT-2-generated technical research text. We first consider the realistic scenario where the defender does not have full information about the adversary's text generation pipeline, but is able to label small amounts of in-domain genuine and synthetic text in order to adapt to the target distribution. Even in the extreme scenario of adapting a physicsdomain detector to a biomedical detector, we find that only a few hundred labels are sufficient for good performance. Finally, we show that paragraph-level detectors can be used to detect the tampering of full-length documents under a variety of threat models.
Maintaining and updating signature databases are tedious tasks that normally require a large amount of user effort. The problem becomes harder when features can be distorted by observation noise, which we call volatility. To address this issue, we propose algorithms and models to automatically generate signatures in the presence of noise, with a focus on single-probe stack fingerprinting, which is a research area that aims to discover the operating system of remote hosts using their response to a TCP SYN packet. Armed with this framework, we construct a database with 420 network stacks, label the signatures, develop a robust classifier for this database, and fingerprint 66M visible webservers on the Internet. We compare the obtained results against Nmap and discover interesting limitations of its classification process that prevent correct operation when its auxiliary probes (e.g., TCP rainbow, TCP ACK, and UDP to a closed port) are blocked by firewalls.Index Terms-OS fingerprinting, internet measurement.
Traditional TCP/IP fingerprinting tools (e.g., nmap) are poorly suited for Internet-wide use due to the large amount of traffic and intrusive nature of the probes. This can be overcome by approaches that rely on a single SYN packet to elicit a vector of features from the remote server; however, these methods face difficult classification problems due to the high volatility of the features and severely limited amounts of information contained therein. Since these techniques have not been studied before, we first pioneer stochastic theory of single-packet OS fingerprinting, build a database of 116 OSes, design a classifier based on our models, evaluate its accuracy in simulations, and then perform OS classification of 37.8M hosts from an Internet-wide scan.
Traditional TCP/IP fingerprinting tools (e.g., nmap) are poorly suited for Internet-wide use due to the large amount of traffic and intrusive nature of the probes. This can be overcome by approaches that rely on a single SYN packet to elicit a vector of features from the remote server; however, these methods face difficult classification problems due to the high volatility of the features and severely limited amounts of information contained therein. Since these techniques have not been studied before, we first pioneer stochastic theory of single-packet OS fingerprinting, build a database of 116 OSes, design a classifier based on our models, evaluate its accuracy in simulations, and then perform OS classification of 37.8M hosts from an Internet-wide scan.
Maintaining and updating signature databases is a tedious task that normally requires a large amount of user effort. The problem becomes harder when features can be distorted by observation noise, which we call volatility. To address this issue, we propose algorithms and models to automatically generate signatures in the presence of noise, with a focus on stack fingerprinting, which is a research area that aims to discover the operating system (OS) of remote hosts using TCP/IP packets. Armed with this framework, we construct a database with 420 network stacks, label the signatures, develop a robust classifier for this database, and fingerprint 66M visible webservers on the Internet.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.