Differential privacy offers a way to answer queries about sensitive information while providing strong, provable privacy guarantees, ensuring that the presence or absence of a single individual in the database has a negligible statistical effect on the query's result. Proving that a given query has this property involves establishing a bound on the query's sensitivity-how much its result can change when a single record is added or removed.A variety of tools have been developed for certifying that a given query is differentially private. In one approach, Reed and Pierce [34] proposed a functional programming language, Fuzz, for writing differentially private queries. Fuzz uses linear types to track sensitivity and a probability monad to express randomized computation; it guarantees that any program with a certain type is differentially private. Fuzz can successfully verify many useful queries. However, it fails when the sensitivity analysis depends on values that are not known statically.We present DFuzz, an extension of Fuzz with a combination of linear indexed types and lightweight dependent types. This combination allows a richer sensitivity analysis that is able to certify a larger class of queries as differentially private, including ones whose sensitivity depends on runtime information. As in Fuzz, the differential privacy guarantee follows directly from the soundness theorem of the type system. We demonstrate the enhanced expressivity of DFuzz by certifying differential privacy for a broad class of iterative algorithms that could not be typed previously.
In this paper, we develop compositional methods for formally verifying differential privacy for algorithms whose analysis goes beyond the composition theorem. Our methods are based on the observation that differential privacy has deep connections with a generalization of probabilistic couplings, an established mathematical tool for reasoning about stochastic processes. Even when the composition theorem is not helpful, we can often prove privacy by a coupling argument. We demonstrate our methods on two algorithms: the Exponential mechanism and the Above Threshold algorithm, the critical component of the famous Sparse Vector algorithm. We verify these examples in a relational program logic apRHL+, which can construct approximate couplings. This logic extends the existing apRHL logic with more general rules for the Laplace mechanism and the one-sided Laplace mechanism, and new structural rules enabling pointwise reasoning about privacy; all the rules are inspired by the connection with coupling. While our paper is presented from a formal verification perspective, we believe that its main insight is of independent interest for the differential privacy community
Differential privacy is becoming a gold standard notion of privacy; it offers a guaranteed bound on loss of privacy due to release of query results, even under worst-case assumptions. The theory of differential privacy is an active research area, and there are now differentially private algorithms for a wide range of problems.However, the question of when differential privacy works in practice has received relatively little attention. In particular, there is still no rigorous method for choosing the key parameter ε, which controls the crucial tradeoff between the strength of the privacy guarantee and the accuracy of the published results.In this paper, we examine the role of these parameters in concrete applications, identifying the key considerations that must be addressed when choosing specific values. This choice requires balancing the interests of two parties with conflicting objectives: the data analyst, who wishes to learn something abou the data, and the prospective participant, who must decide whether to allow their data to be included in the analysis. We propose a simple model that expresses this balance as formulas over a handful of parameters, and we use our model to choose ε on a series of simple statistical studies. We also explore a surprising insight: in some circumstances, a differentially private study can be more accurate than a non-private study for the same cost, under our model. Finally, we discuss the simplifying assumptions in our model and outline a research agenda for possible refinements.
Linear logic is well known for its resource-awareness, which has inspired the design of several resource management mechanisms in programming language design. Its resource-awareness arises from the distinction between linear, single-use data and non-linear, reusable data. The latter is marked by the so-called exponential modality, which, from the categorical viewpoint, is a (monoidal) comonad.Monadic notions of computation are well-established mechanisms used to express effects in pure functional languages. Less well-established is the notion of comonadic computation. However, recent works have shown the usefulness of comonads to structure context dependent computations. In this work, we present a language RPCF inspired by a generalized interpretation of the exponential modality. In RPCF the exponential modality carries a label-an element of a semiring R-that provides additional information on how a program uses its context. This additional structure is used to express comonadic type analysis.
Mechanism design is the study of algorithm design where the inputs to the algorithm are controlled by strategic agents, who must be incentivized to faithfully report them. Unlike typical programmatic properties, it is not sufficient for algorithms to merely satisfy the property-incentive properties are only useful if the strategic agents also believe this fact.Verification is an attractive way to convince agents that the incentive properties actually hold, but mechanism design poses several unique challenges: interesting properties can be sophisticated relational properties of probabilistic computations involving expected values, and mechanisms may rely on other probabilistic properties, like differential privacy, to achieve their goals.We introduce a relational refinement type system, called HOARe 2 , for verifying mechanism design and differential privacy. We show that HOARe 2 is sound w.r.t. a denotational semantics, and correctly models p , δq-differential privacy; moreover, we show that it subsumes DFuzz, an existing linear dependent type system for differential privacy. Finally, we develop an SMT-based implementation of HOARe 2 and use it to verify challenging examples of mechanism design, including auctions and aggregative games, and new proposed examples from differential privacy.
Effects and coeffects are two general, complementary aspects of program behaviour. They roughly correspond to computations which change the execution context (effects) versus computations which make demands on the context (coeffects). Effectful features include partiality, non-determinism, input-output, state, and exceptions. Coeffectful features include resource demands, variable access, notions of linearity, and data input requirements.The effectful or coeffectful behaviour of a program can be captured and described via type-based analyses, with fine grained information provided by monoidal effect annotations and semiring coeffects. Various recent work has proposed models for such typed calculi in terms of graded (strong) monads for effects and graded (monoidal) comonads for coeffects.Effects and coeffects have been studied separately so far, but in practice many computations are both effectful and coeffectful, e.g., possibly throwing exceptions but with resource requirements. To remedy this, we introduce a new general calculus with a combined effect-coeffect system. This can describe both the changes and requirements that a program has on its context, as well as interactions between these effectful and coeffectful features of computation. The effect-coeffect system has a denotational model in terms of effect-graded monads and coeffect-graded comonads where interaction is expressed via the novel concept of graded distributive laws. This graded semantics unifies the syntactic type theory with the denotational model. We show that our calculus can be instantiated to describe in a natural way various different kinds of interaction between a program and its evaluation context.
We present a practical, differentially private algorithm for answering a large number of queries on high dimensional datasets. Like all algorithms for this task, ours necessarily has worst-case complexity exponential in the dimension of the data. However, our algorithm packages the computationally hard step into a concisely defined integer program, which can be solved non-privately using standard solvers. We prove accuracy and privacy theorems for our algorithm, and then demonstrate experimentally that our algorithm performs well in practice. For example, our algorithm can efficiently and accurately answer millions of queries on the Netflix dataset, which has over 17,000 attributes; this is an improvement on the state of the art by multiple orders of magnitude.
Differential privacy is a rigorous, worst-case notion of privacy-preserving computation. Informally, a probabilistic program is differentially private if the participation of a single individual in the input database has a limited effect on the program's distribution on outputs. More technically, differential privacy is a quantitative 2-safety property that bounds the distance between the output distributions of a probabilistic program on adjacent inputs. Like many 2-safety properties, differential privacy lies outside the scope of traditional verification techniques. Existing approaches to enforce privacy are based on intricate, non-conventional type systems, or customized relational logics. These approaches are difficult to implement and often cumbersome to use.We present an alternative approach that verifies differential privacy by standard, non-relational reasoning on nonprobabilistic programs. Our approach transforms a probabilistic program into a non-probabilistic program which simulates two executions of the original program. We prove that if the target program is correct with respect to a Hoare specification, then the original probabilistic program is differentially private. We provide a variety of examples from the differential privacy literature to demonstrate the utility of our approach. Finally, we compare our approach with existing verification techniques for privacy.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.