We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance. We show that in general, DP hypothesis tests can be written in terms of linear constraints, and for exchangeable data can always be expressed as a function of the empirical distribution. Using this structure, we prove a 'Neyman-Pearson lemma' for binomial data under DP, where the DP-UMP only depends on the sample sum. Our tests can also be stated as a post-processing of a random variable, whose distribution we coin "Truncated-Uniform-Laplace" (Tulap), a generalization of the Staircase and discrete Laplace distributions. Furthermore, we obtain exact p-values, which are easily computed in terms of the Tulap random variable. Using the above techniques, we show that our tests can be applied to give uniformly most accurate one-sided confidence intervals and optimal confidence distributions. We also derive uniformly most powerful unbiased (UMPU) two-sided tests, which lead to uniformly most accurate unbiased (UMAU) two-sided confidence intervals. We show that our results can be applied to distribution-free hypothesis tests for continuous data. Our simulation results demonstrate that all our tests have exact type I error, and are more powerful than current techniques.
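As a rough illustration of the Tulap random variable mentioned above, the following Python sketch draws noise for the pure ε-DP case only. The abstract does not state the definition, so the construction here is an assumption: untruncated Tulap noise is taken to be a discrete Laplace variable (a difference of two geometric variables with parameter 1 - b, where b = exp(-ε)) plus independent Uniform(-1/2, 1/2) noise. The function name `sample_tulap` and its parameters are hypothetical, and the truncation step needed for δ > 0 is omitted.

```python
import numpy as np

def sample_tulap(m, epsilon, size=1, rng=None):
    """Sketch: draw from an untruncated Tulap centered at m (assumed form).

    Assumption (not stated in the abstract): with b = exp(-epsilon), the noise
    is G1 - G2 + U, where G1, G2 are iid geometric with success probability
    1 - b and U is Uniform(-1/2, 1/2).
    """
    rng = np.random.default_rng() if rng is None else rng
    b = np.exp(-epsilon)
    # NumPy's geometric has support {1, 2, ...}; the shift cancels in G1 - G2,
    # so the difference is a discrete Laplace variable with P(k) ~ b**|k|.
    g1 = rng.geometric(1.0 - b, size=size)
    g2 = rng.geometric(1.0 - b, size=size)
    u = rng.uniform(-0.5, 0.5, size=size)
    return m + (g1 - g2) + u
```

Under this assumption, a private summary of binomial data would be obtained by calling `sample_tulap(x, epsilon)` with `x` the sample sum, and any test or p-value would then be computed from that single noisy value.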
The Tutte polynomial is a fundamental invariant of graphs. In this article, we define and study a generalization of the Tutte polynomial for directed graphs, that we name the B-polynomial. The B-polynomial has three variables, but when specialized to the case of graphs (that is, digraphs where arcs come in pairs with opposite directions), one of the variables becomes redundant and the B-polynomial is equivalent to the Tutte polynomial. We explore various properties, expansions, specializations, and generalizations of the B-polynomial, and try to answer the following questions: (1) what properties of the digraph can be detected from its B-polynomial (acyclicity, length of directed paths, number of strongly connected components, etc.)? (2) which of the marvelous properties of the Tutte polynomial carry over to the directed graph setting? The B-polynomial generalizes the strict chromatic polynomial of mixed graphs introduced by Beck, Bogart and Pham. We also consider a quasisymmetric function version of the B-polynomial which simultaneously generalizes the Tutte symmetric function of Stanley and the quasisymmetric chromatic function of Shareshian and Wachs.
Differential privacy (DP) provides a framework for provable privacy protection against arbitrary adversaries, while allowing the release of summary statistics and synthetic data. We address the problem of releasing a noisy real-valued statistic vector T, a function of sensitive data, under DP via the class of K-norm mechanisms, with the goal of minimizing the noise added to achieve privacy. First, we introduce the sensitivity space of T, which extends the concepts of sensitivity polytope and sensitivity hull to the setting of arbitrary statistics T. We then propose a framework consisting of three methods for comparing the K-norm mechanisms: 1) a multivariate extension of stochastic dominance, 2) the entropy of the mechanism, and 3) the conditional variance given a direction, to identify the optimal K-norm mechanism. In all of these criteria, the optimal K-norm mechanism is generated by the convex hull of the sensitivity space. Using our methodology, we extend the objective perturbation and functional mechanisms and apply these tools to logistic and linear regression, allowing for private releases of statistical results. Via simulations and an application to a housing price dataset, we demonstrate that our proposed methodology offers a substantial improvement in utility for the same level of risk.
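To make the idea of a K-norm mechanism concrete, here is a minimal Python sketch for one particular norm, the ℓ∞ norm. It is not the optimal mechanism described in the abstract (which uses the convex hull of the sensitivity space); it only illustrates the standard sampling recipe for noise with density proportional to exp(-(ε/Δ)‖x‖∞): draw a radius from a Gamma(d + 1, Δ/ε) distribution and scale a point drawn uniformly from the unit ℓ∞ ball. The function name and arguments are hypothetical, and `sensitivity` is assumed to be the ℓ∞ sensitivity Δ of the statistic.

```python
import numpy as np

def linf_knorm_mechanism(t, sensitivity, epsilon, rng=None):
    """Sketch: release t plus K-norm noise for the l-infinity norm.

    Noise density is proportional to exp(-(epsilon / sensitivity) * ||x||_inf).
    `t` is assumed to be a 1-D statistic vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.atleast_1d(np.asarray(t, dtype=float))
    d = t.shape[0]
    # Radius ~ Gamma(shape = d + 1, scale = sensitivity / epsilon).
    radius = rng.gamma(shape=d + 1, scale=sensitivity / epsilon)
    # Uniform point in the unit l-infinity ball, i.e. the cube [-1, 1]^d.
    direction = rng.uniform(-1.0, 1.0, size=d)
    return t + radius * direction
```

Swapping in a different norm ball changes only how `direction` is drawn, which is why comparing norm balls, as the abstract proposes, amounts to comparing mechanisms.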
f-DP has recently been proposed as a generalization of classical definitions of differential privacy, allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of f-DP, we propose the concept of a canonical noise distribution (CND), which captures whether an additive privacy mechanism is appropriately tailored for a given f, and give a construction that produces a CND given an arbitrary tradeoff function f. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private p-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data. We apply our techniques to the problem of difference of proportions testing, and construct a UMP unbiased "semi-private" test which upper bounds the performance of any DP test. Using this as a benchmark, we propose a private test, based on the inversion of characteristic functions, which allows for optimal inference for the two population parameters and is nearly as powerful as the semi-private UMPU. When specialized to the case of (ε, 0)-DP, we show empirically that our proposed test is more powerful than any (ε/√2)-DP test and has more accurate type I errors than the classic normal approximation test.
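For readers unfamiliar with tradeoff functions, the Python sketch below evaluates the tradeoff function usually associated with (ε, δ)-DP in the f-DP literature, f(α) = max{0, 1 - δ - e^ε α, e^(-ε)(1 - δ - α)}, which lower-bounds the type II error of any test with type I error α. The abstract does not spell this formula out, so it is included here as background under that assumption; the function name is hypothetical.

```python
import math

def tradeoff_eps_delta(alpha, epsilon, delta=0.0):
    """Sketch: tradeoff function f_{epsilon, delta} evaluated at type I error alpha.

    Returns the smallest type II error achievable at level alpha when
    distinguishing neighboring datasets under (epsilon, delta)-DP.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return max(
        0.0,
        1.0 - delta - math.exp(epsilon) * alpha,
        math.exp(-epsilon) * (1.0 - delta - alpha),
    )
```

With ε = 0 and δ = 0 the function reduces to 1 - α, the case of perfect privacy where no test beats random guessing; larger ε pushes the curve toward zero.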