A privacy-constrained information extraction problem is considered where for a pair of correlated discrete random variables (X, Y) governed by a given joint distribution, an agent observes Y and wants to convey to a potentially public user as much information about Y as possible while limiting the amount of information revealed about X. To this end, the so-called rate-privacy function is investigated to quantify the maximal amount of information (measured in terms of mutual information) that can be extracted from Y under a privacy constraint between X and the extracted information, where privacy is measured using either mutual information or maximal correlation. Properties of the rate-privacy function are analyzed and its information-theoretic and estimation-theoretic interpretations are presented for both the mutual information and maximal correlation privacy measures. It is also shown that the rate-privacy function admits a closed-form expression for a large family of joint distributions of (X, Y). Finally, the rate-privacy function under the mutual information privacy measure is considered for the case where (X, Y) has a joint probability density function by studying the problem where the extracted information is a uniform quantization of Y corrupted by additive Gaussian noise. The asymptotic behavior of the rate-privacy function is studied as the quantization resolution grows without bound and it is observed that not all of the properties of the rate-privacy function carry over from the discrete to the continuous case.
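The rate-privacy function described above, g(ε) = sup I(Y;Z) over all mappings P_{Z|Y} with I(X;Z) ≤ ε, can be approximated numerically for small alphabets. The sketch below is an illustrative Monte-Carlo search over binary-output channels under an assumed binary joint distribution; the `rate_privacy` helper and the search procedure are not the paper's method, only a way to see the quantity in action.

```python
import math
import random

def mutual_information(p_joint):
    # I(A;B) in bits, from a joint distribution given as a 2D list of floats
    pa = [sum(row) for row in p_joint]
    pb = [sum(col) for col in zip(*p_joint)]
    mi = 0.0
    for i, row in enumerate(p_joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[i] * pb[j]))
    return mi

def rate_privacy(p_xy, eps, trials=20000, seed=0):
    """Monte-Carlo lower bound on g(eps) = max I(Y;Z) s.t. I(X;Z) <= eps,
    searching over binary-output channels P(Z|Y). Illustrative only."""
    rng = random.Random(seed)
    p_y = [sum(col) for col in zip(*p_xy)]       # marginal of Y
    best = 0.0
    for _ in range(trials):
        a, b = rng.random(), rng.random()        # P(Z=1|Y=0), P(Z=1|Y=1)
        # joint of (Y, Z) induced by the candidate channel
        p_yz = [[p_y[0] * (1 - a), p_y[0] * a],
                [p_y[1] * (1 - b), p_y[1] * b]]
        # joint of (X, Z): Z depends on X only through Y
        p_xz = [[p_xy[i][0] * (1 - a) + p_xy[i][1] * (1 - b),
                 p_xy[i][0] * a + p_xy[i][1] * b] for i in range(2)]
        if mutual_information(p_xz) <= eps:      # privacy constraint
            best = max(best, mutual_information(p_yz))
    return best

# Example: X observed through a binary symmetric channel with crossover 0.1
p_xy = [[0.45, 0.05],
        [0.05, 0.45]]
print(rate_privacy(p_xy, eps=0.05))
```

Because the search only samples channels, the returned value is a lower bound on g(ε); a finer parameterization or larger output alphabet for Z would tighten it.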
We investigate the tradeoff between privacy and utility in a setting where both are measured in terms of mutual information. For the binary case, we fully characterize this tradeoff under perfect privacy and give an upper bound for the case where some privacy leakage is allowed. We then introduce a new quantity that captures the amount of private information contained in the observable data and connect it to the optimal tradeoff between privacy and utility.
We investigate the problem of estimating a random variable Y under a privacy constraint dictated by another correlated random variable X. When X and Y are discrete, we express the underlying privacy-utility tradeoff in terms of the privacy-constrained guessing probability h(P_XY, ε), the maximum probability Pc(Y|Z) of correctly guessing Y given an auxiliary random variable Z, where the maximization is taken over all P_{Z|Y} ensuring that Pc(X|Z) ≤ ε for a given privacy threshold ε ≥ 0. We prove that h(P_XY, ·) is concave and piecewise linear, which allows us to derive its expression in closed form for any ε when X and Y are binary. In the non-binary case, we derive h(P_XY, ε) in the high-utility regime (i.e., for sufficiently large, but nontrivial, values of ε) under the assumption that Y and Z have the same alphabet. We also analyze the privacy-constrained guessing probability for two scenarios in which X, Y, and Z are binary vectors. When X and Y are continuous random variables, we formulate the corresponding privacy-utility tradeoff in terms of sENSR(P_XY, ε), the smallest normalized minimum mean-squared error (MMSE) incurred in estimating Y from a Gaussian perturbation Z. Here the minimization is taken over a family of Gaussian perturbations Z for which the MMSE of f(X) given Z is within a factor 1 − ε of the variance of f(X) for any non-constant real-valued function f. We derive tight upper and lower bounds on sENSR when Y is Gaussian. For general absolutely continuous random variables, we obtain a tight lower bound on sENSR(P_XY, ε) in the high-privacy regime, i.e., for small ε.
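The guessing probability Pc(Y|Z) = Σ_z max_y P(Y = y, Z = z) is simple to compute from a joint table, so the discrete tradeoff above can be probed directly. The sketch below uses an assumed binary joint distribution and a Monte-Carlo search over binary-output channels; the `best_guess_under_privacy` helper is a hypothetical name, and the search yields only a lower bound on h(P_XY, ε), not its closed form.

```python
import random

def pc(p_joint):
    """Probability of correctly guessing the row variable from the column
    variable: Pc = sum over columns z of max over rows y of P(y, z)."""
    return sum(max(col) for col in zip(*p_joint))

def best_guess_under_privacy(p_xy, eps, trials=20000, seed=1):
    """Monte-Carlo lower bound on h(P_XY, eps): maximize Pc(Y|Z) over random
    binary-output channels P(Z|Y) subject to Pc(X|Z) <= eps."""
    rng = random.Random(seed)
    p_y = [sum(col) for col in zip(*p_xy)]       # marginal of Y
    best = None
    for _ in range(trials):
        a, b = rng.random(), rng.random()        # P(Z=1|Y=0), P(Z=1|Y=1)
        p_yz = [[p_y[0] * (1 - a), p_y[0] * a],
                [p_y[1] * (1 - b), p_y[1] * b]]
        p_xz = [[p_xy[i][0] * (1 - a) + p_xy[i][1] * (1 - b),
                 p_xy[i][0] * a + p_xy[i][1] * b] for i in range(2)]
        if pc(p_xz) <= eps:                      # privacy constraint on X
            cand = pc(p_yz)
            best = cand if best is None or cand > best else best
    return best

p_xy = [[0.45, 0.05],
        [0.05, 0.45]]
# eps must exceed the trivial guessing probability max_x P(x) = 0.5,
# since Pc(X|Z) >= Pc(X) for any Z
print(best_guess_under_privacy(p_xy, eps=0.55))
```

Note that Pc(Y|Z) ≥ max_y P(y) = 0.5 for every feasible channel, consistent with the piecewise-linear structure of h(P_XY, ·) proved in the abstract above.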
We investigate the predictability of a random variable Y under a privacy constraint dictated by a random variable X correlated with Y, where both predictability and privacy are assessed in terms of the minimum mean-squared error (MMSE). Given that X and Y are connected via a binary-input symmetric-output (BISO) channel, we derive the optimal random mapping P_{Z|Y} such that the MMSE of Y given Z is minimized while the MMSE of X given Z is greater than (1 − ε)var(X) for a given ε ≥ 0. We also consider the case where (X, Y) are continuous and P_{Z|Y} is restricted to be an additive-noise channel.
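The additive-noise case admits a closed-form illustration when (X, Y) are jointly Gaussian and Z = Y + √s·N with N ~ N(0, 1) independent: all MMSEs reduce to variance formulas. The sketch below applies the privacy constraint mmse(X|Z) ≥ (1 − ε)var(X) with f taken as the identity only, which is a simplification of the general formulation (the full definition ranges over all non-constant f); `smallest_noise` is a hypothetical helper, not a function from the papers.

```python
def mmse_linear(var_a, cov_az, var_z):
    # MMSE of A given Z when (A, Z) are jointly Gaussian:
    # mmse(A|Z) = var(A) - cov(A,Z)^2 / var(Z)
    return var_a - cov_az**2 / var_z

def smallest_noise(var_x, var_y, cov_xy, eps):
    """Smallest noise variance s (hence best utility) such that
    mmse(X|Z) >= (1 - eps) * var_x for Z = Y + sqrt(s) * N.
    Solving var_x - cov_xy^2 / (var_y + s) >= (1 - eps) * var_x for s:"""
    s = cov_xy**2 / (eps * var_x) - var_y
    return max(s, 0.0)

# Assumed example: unit-variance X and Y with correlation 0.8
var_x, var_y, cov_xy = 1.0, 1.0, 0.8
eps = 0.1
s = smallest_noise(var_x, var_y, cov_xy, eps)
var_z = var_y + s                      # var(Z) = var(Y) + s; cov(Y,Z) = var(Y)
print(mmse_linear(var_y, var_y, var_z))    # utility: MMSE of Y given Z
print(mmse_linear(var_x, cov_xy, var_z))   # privacy: equals (1 - eps) * var_x
```

As ε shrinks (stronger privacy), the required noise variance s grows without bound and the MMSE of Y given Z approaches var(Y), i.e., Z becomes useless for predicting Y, matching the high-privacy regime discussed above.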