The problem of identifiability of finite mixtures of finite product measures is studied. A mixture model with K mixture components and L observed variables is considered, where each variable takes its value in a finite set of cardinality M. The variables are independent within each mixture component. A mixture model is identifiable if the parameters of its mixture components can be recovered by observing its mixture distribution. In this paper, we investigate fundamental relations between the identifiability of mixture models and the separability of their observed variables by introducing two types of separability: strongly and weakly separable variables. Roughly speaking, a variable is said to be separable if and only if its probability distributions differ among the mixture components. We prove that mixture models are identifiable if the number of strongly separable variables is greater than or equal to 2K − 1, independent of M. This fundamental threshold is shown to be tight: a family of non-identifiable mixture models with fewer than 2K − 1 strongly separable variables is provided. We also show that mixture models are identifiable if they have at least 2K weakly separable variables. To prove these theorems, we introduce a particular polynomial, called the characteristic polynomial, which translates the identifiability conditions into an identity of polynomials and allows us to construct an inductive proof.
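As a rough illustration of the setting above, the following Python sketch computes the mixture distribution of K components over L variables, each taking M values. The function names, the nested-list parameter layout, and the example numbers are all assumptions made for illustration; they do not come from the paper. The sketch also shows the standard fact that relabeling the components leaves the mixture distribution unchanged, so identifiability can only hold up to a permutation of the components.

```python
from itertools import product


def mixture_pmf(weights, components):
    """Joint distribution of L variables under a mixture of product measures.

    weights[k]          : mixing weight of component k (hypothetical layout)
    components[k][l][m] : P(variable l = m | component k)
    Returns a dict mapping each outcome tuple x to its mixture probability.
    """
    num_vars = len(components[0])
    num_vals = len(components[0][0])
    pmf = {}
    for x in product(range(num_vals), repeat=num_vars):
        total = 0.0
        for w, comp in zip(weights, components):
            p = w
            # Variables are independent within a component, so probabilities multiply.
            for l, value in enumerate(x):
                p *= comp[l][value]
            total += p
        pmf[x] = total
    return pmf


def strongly_separable_threshold(num_components):
    """The sufficient count 2K - 1 of strongly separable variables quoted in the abstract."""
    return 2 * num_components - 1


# Illustrative parameters only: K = 2 components, L = 3 variables, M = 2 values.
example_weights = [0.3, 0.7]
example_components = [
    [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    [[0.4, 0.6], [0.7, 0.3], [0.1, 0.9]],
]
example_pmf = mixture_pmf(example_weights, example_components)
# Swapping the component labels gives the same observable distribution.
relabeled_pmf = mixture_pmf(example_weights[::-1], example_components[::-1])
```

The enumeration over all M^L outcomes is only practical for small L; it is meant to make the object of study concrete, not to reproduce the paper's characteristic-polynomial argument.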
In this paper, we study the problem of delay minimization in NFV-based networks. In such systems, the ultimate goal of any request is to compute a sequence of functions in the network, where each function can be computed only at a specific subset of network nodes. In conventional approaches, for each function, one node is chosen from the corresponding subset to compute that function. In contrast, in this work, we allow each function to be computed at more than one node, redundantly and in parallel, to respond to a given request. We argue that such redundancy in computation not only improves the reliability of the network but can also, perhaps surprisingly, reduce the overall transmission delay. In particular, we establish that by judiciously choosing the subset of nodes that compute each function, in conjunction with a linear network coding scheme to deliver the result of each computation, we can characterize and achieve the optimal end-to-end transmission delay. In addition, we show that with such a technique it is possible to significantly reduce the transmission delay compared to conventional approaches. In fact, in some scenarios, the reduction can even scale with the size of the network: increasing the number of nodes that can compute a given function in parallel by a multiplicative factor decreases the end-to-end delay by the same factor. Moreover, we show that while finding the subset of nodes for each computation is, in general, a complex integer program, approximation algorithms can be proposed to reduce the computational complexity. In fact, when the number of computing nodes for a given function is upper-bounded by a constant, a dynamic programming scheme can be proposed to find the optimal subsets in polynomial time. Our numerical simulations confirm the performance gain over conventional approaches.
In this paper, we introduce the problem of private sequential function computation, where a user wishes to compute a composition of a sequence of K linear functions, in a specific order, for an arbitrary input. The user does not run these computations locally; rather, it exploits the existence of N non-colluding servers, each of which can compute any of the K functions on any given input. However, the user does not want to reveal any information about the desired order of computations to the servers. For this problem, we study the capacity C, defined as the supremum of the number of desired computations, normalized by the number of computations done at the servers, subject to the privacy constraint. In particular, we prove that […]. For the achievability, we show that the user can retrieve the desired order of computations by choosing a proper order of inquiries among different servers, while keeping the order of computations at each server fixed, irrespective of the desired order of computations. Finally, we develop an information-theoretic converse, which results in an upper bound on the capacity.
In Genome-Wide Association Study (GWAS) problems, the objective is to associate subsequences of individuals' genomes with observable characteristics called phenotypes. The genome, which contains the biological information of an individual, can be represented by a sequence of length G. Many observable characteristics of individuals can be related to a subsequence of a given length L, called the causal subsequence. Environmental effects make the relation between the causal subsequence and the observable characteristics a stochastic function. Our objective in this paper is to detect the causal subsequence of a specific phenotype using a dataset of N individuals and their observed characteristics. We introduce an abstract formulation of GWAS that allows us to investigate the problem from an information-theoretic perspective. In particular, as the parameters N, G, and L grow, we observe a threshold effect at Gh(L/G)/N, where h(.) is the binary entropy function. This effect allows us to define the capacity of recovering the causal subsequence by denoting the rate of the GWAS problem as Gh(L/G)/N. We develop an achievable scheme and a matching converse for this problem, and thus characterize its capacity in two scenarios: the zero-error-rate and the ε-error-rate.
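The rate quantity in this abstract can be evaluated numerically. The sketch below assumes the threshold expression is the ratio Gh(L/G)/N, with h the binary entropy in bits; the function names and the example values are illustrative and not taken from the paper. As a sanity check, it relies on the standard bound that G·h(L/G) is at least log2 of the number of ways to choose L positions out of G.

```python
import math


def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def gwas_rate(genome_length, causal_length, num_individuals):
    """Rate G * h(L / G) / N, under the assumed reading of the abstract's formula."""
    if num_individuals <= 0:
        raise ValueError("the number of individuals N must be positive")
    if not 0 <= causal_length <= genome_length:
        raise ValueError("L must satisfy 0 <= L <= G")
    fraction = causal_length / genome_length
    return genome_length * binary_entropy(fraction) / num_individuals


# Illustrative values only: G = 1000 positions, L = 100 causal positions, N = 200 individuals.
example_rate = gwas_rate(1000, 100, 200)
# G * h(L / G) upper-bounds the bits needed to index one of C(G, L) candidate subsequences.
bits_to_index = math.log2(math.comb(1000, 100))
entropy_bound = 1000 * binary_entropy(0.1)
```

Comparing `example_rate` against the capacity characterized in the paper would indicate on which side of the threshold a given (N, G, L) falls; the capacity value itself is not stated in the abstract, so it is left out here.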