As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms. In this paper, we perform the first systematic study of poisoning attacks and their countermeasures for linear regression models. In poisoning attacks, attackers deliberately influence the training data to manipulate the results of a predictive model. We propose a theoretically-grounded optimization framework specifically designed for linear regression and demonstrate its effectiveness on a range of datasets and models. We also introduce a fast statistical attack that requires limited knowledge of the training process. Finally, we design a new principled defense method that is highly resilient against all poisoning attacks. We provide formal guarantees about its convergence and an upper bound on the effect of poisoning attacks when the defense is deployed. We evaluate extensively our attacks and defenses on three realistic datasets from health care, loan assessment, and real estate domains. 1
Recent targeted attacks have increased significantly in sophistication, undermining the fundamental assumptions on which most cryptographic primitives rely for security. For instance, attackers launching an Advanced Persistent Threat (APT) can steal full cryptographic keys, violating the very secrecy of "secret" keys that cryptographers assume in designing secure protocols. In this article, we introduce a game-theoretic framework for modeling various computer security scenarios prevalent today, including targeted attacks. We are particularly interested in situations in which an attacker periodically compromises a system or critical resource completely, learns all its secret information and is not immediately detected by the system owner or defender.We propose a two-player game between an attacker and defender called FL I PIT or The Game of "Stealthy Takeover." In FL I PIT, players compete to control a shared resource. Unlike most existing games, FL I PIT allows players to move at any given time, taking control of the resource. The identity of the player controlling the resource, however, is not revealed until a player actually moves. To move, a player pays a certain move cost. The objective of each player is to control the resource a large fraction of time, while minimizing his total move cost. FL I PIT provides a simple and elegant framework in which we can formally reason about the interaction between attackers and defenders in practical scenarios.In this article, we restrict ourselves to games in which one of the players (the defender) plays with a renewal strategy, one in which the intervals between consecutive moves are chosen independently and uniformly at random from a fixed probability distribution. We consider attacker strategies ranging in increasing sophistication from simple periodic strategies (with moves spaced at equal time intervals) to more complex adaptive strategies, in which moves are determined based on feedback received during the game. For different classes of strategies employed by the attacker, we determine strongly dominant strategies for both players (when they exist), strategies that achieve higher benefit than all other strategies in a particular class. When strongly dominant strategies do not exist, our goal is to characterize the residual game consisting of strategies that are not strongly dominated by other strategies. We also prove equivalence or strict inclusion of certain classes of strategies under different conditions. Our analysis of different FL I PIT variants teaches cryptographers, system designers, and the community at large some valuable lessons:1. Systems should be designed under the assumption of repeated total compromise, including theft of cryptographic keys. FL I PIT provides guidance on how to implement a cost-effective defensive strategy.2. Aggressive play by one player can motivate the opponent to drop out of the game (essentially not to play at all). Therefore, moving fast is a good defensive strategy, but it can only be implemented if move costs are lo...
Security is a major barrier to enterprise adoption of cloud computing. Physical co-residency with other tenants poses a particular risk, due to pervasive virtualization in the cloud. Recent research has shown how side channels in shared hardware may enable attackers to exfiltrate sensitive data across virtual machines (VMs). In view of such risks, cloud providers may promise physically isolated resources to select tenants, but a challenge remains: Tenants still need to be able to verify physical isolation of their VMs.We introduce HomeAlone, a system that lets a tenant verify its VMs' exclusive use of a physical machine. The key idea in HomeAlone is to invert the usual application of side channels. Rather than exploiting a side channel as a vector of attack, HomeAlone uses a side-channel (in the L2 memory cache) as a novel, defensive detection tool. By analyzing cache usage during periods in which "friendly" VMs coordinate to avoid portions of the cache, a tenant using HomeAlone can detect the activity of a co-resident "foe" VM. Key technical contributions of HomeAlone include classification techniques to analyze cache usage and guest operating system kernel modifications that minimize the performance impact of friendly VMs sidestepping monitored cache portions. HomeAlone requires no modification of existing hypervisors and no special action or cooperation by the cloud provider.
It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model. We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences are included in just one document in the training data.We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. For example, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models.
Abstract-RecentWe address the problem of detecting early-stage infection in an enterprise setting by proposing a new framework based on belief propagation inspired from graph theory. Belief propagation can be used either with "seeds" of compromised hosts or malicious domains (provided by the enterprise security operation center -SOC) or without any seeds. In the latter case we develop a detector of C&C communication particularly tailored to enterprises which can detect a stealthy compromise of only a single host communicating with the C&C server.We demonstrate that our techniques perform well on detecting enterprise infections. We achieve high accuracy with low false detection and false negative rates on two months of anonymized DNS logs released by Los Alamos National Lab (LANL), which include APT infection attacks simulated by LANL domain experts. We also apply our algorithms to 38TB of real-world web proxy logs collected at the border of a large enterprise. Through careful manual investigation in collaboration with the enterprise SOC, we show that our techniques identified hundreds of malicious domains overlooked by state-of-the-art security products.
Abstract. Most cryptographic systems carry the basic assumption that entities are able to preserve the secrecy of their keys. With attacks today showing ever increasing sophistication, however, this tenet is eroding. "Advanced Persistent Threats" (APTs), for instance, leverage zero-day exploits and extensive system knowledge to achieve full compromise of cryptographic keys and other secrets. Such compromise is often silent, with defenders failing to detect the loss of private keys critical to protection of their systems. The growing virulence of today's threats clearly calls for new models of defenders' goals and abilities. In this paper, we explore applications of FL I PIT, a novel game-theoretic model of system defense introduced in [14]. In FL I PIT, an attacker periodically gains complete control of a system, with the unique feature that system compromises are stealthy, i.e., not immediately detected by the system owner, called the defender. We distill out several lessons from our study of FL I PIT and demonstrate their application to several real-world problems, including password reset policies, key rotation, VM refresh and cloud auditing.
This paper presents a new challenge-verifying that a remote server is storing a file in a fault-tolerant manner, i.e., such that it can survive hard-drive failures. We describe an approach called the Remote Assessment of Fault Tolerance (RAFT). The key technique in a RAFT is to measure the time taken for a server to respond to a read request for a collection of file blocks. The larger the number of hard drives across which a file is distributed, the faster the read-request response. Erasure codes also play an important role in our solution. We describe a theoretical framework for RAFTs and show experimentally that RAFTs can work in practice.
Extending the data trust perimeter from the enterprise to the public cloud requires more than encryption.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.