Abstract: A pragmatic approach to entropy estimation is presented, first for discrete variables and then extended to handle continuous and/or multivariate ones. It is based on coincidence detection, and its application leads to algorithms with three main attractive features: they are easy to use; they require no a priori knowledge of the source distribution (not even the alphabet cardinality K of discrete sources); and, for discrete variables, they can provide useful, though biased, estimates even when the number of samples N is less than K, whereas plug-in methods typically demand N >> K for a proper approximation of the probability mass function. Experiments with both discrete and continuous random variables illustrate the simplicity of the proposed method, and numerical comparisons with other methods show that, despite this simplicity, it yields useful results.
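To make the idea of coincidence detection concrete, the following is a minimal illustrative sketch, not the paper's exact algorithm: it repeatedly draws samples until a value repeats, and uses the birthday-problem approximation E[T] ≈ sqrt(πK/2) for a near-uniform source with K symbols to invert the mean time-to-coincidence into an entropy estimate. The function name and the simple inversion H ≈ log2(2·E[T]²/π) are assumptions made here for illustration; note that no knowledge of K is needed, and each trial uses only a handful of samples.

```python
import math
import random

def coincidence_entropy_estimate(draw, trials=2000):
    """Rough entropy estimate (in bits) from the mean time to first coincidence.

    Illustrative sketch under a near-uniform-source assumption: the birthday
    problem gives E[T] ~ sqrt(pi*K/2) for K equiprobable symbols, so
    K ~ 2*E[T]**2/pi and H ~ log2(K). The estimate is biased (slightly high
    for small K), consistent with the abstract's caveat.
    """
    total = 0
    for _ in range(trials):
        seen = set()
        t = 0
        while True:
            x = draw()          # draw one sample from the unknown source
            t += 1
            if x in seen:       # coincidence detected: stop this trial
                break
            seen.add(x)
        total += t
    mean_t = total / trials     # empirical mean time to first coincidence
    k_est = 2.0 * mean_t * mean_t / math.pi
    return math.log2(k_est)

# Example: a uniform source over 16 symbols has entropy log2(16) = 4 bits;
# the estimate lands near 4 without ever being told that K = 16.
random.seed(0)
h = coincidence_entropy_estimate(lambda: random.randrange(16))
```

Each trial typically stops after only a few draws (on the order of sqrt(K)), which is why such estimators remain usable when the total sample count is below the alphabet size.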