Abstract: This paper develops a new understanding of mean shift algorithms from an information theoretic perspective. We show that Gaussian Blurring Mean Shift (GBMS) directly minimizes Rényi's quadratic entropy of the dataset and hence is unstable by definition. Further, its stable counterpart, Gaussian Mean Shift (GMS), minimizes Rényi's "cross" entropy, whose local stationary solutions are modes of the dataset. By doing so, we aptly answer the question "What do mean shift algorithms optimize?", thus naturally highlighting the properties of these algorithms. A consequence of this new understanding is the superior performance of GMS over GBMS, which we demonstrate in a wide variety of applications ranging from mode finding to clustering and image segmentation.

Index Terms: Mean shift, information theoretic learning, Rényi's entropy.
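As a rough illustration of the mode-seeking behavior that the abstract attributes to GMS, one iteration moves a point to the Gaussian-kernel-weighted mean of the fixed dataset. The following is a minimal sketch; the toy two-cluster dataset, the bandwidth h, and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gms_step(y, X, h):
    """One Gaussian Mean Shift update: move y to the kernel-weighted
    mean of the (fixed) dataset X, using bandwidth h."""
    w = np.exp(-np.sum((X - y) ** 2, axis=1) / (2 * h ** 2))
    return w @ X / w.sum()

# Toy dataset with two well-separated clusters (illustrative).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(3.0, 0.3, size=(50, 2))])

# Iterating from a point in the first cluster converges to that
# cluster's mode; unlike GBMS, the dataset X itself never moves.
y = X[0].copy()
for _ in range(100):
    y = gms_step(y, X, h=0.5)
```

Note the key contrast with GBMS, where every point is updated and X is replaced by its blurred version at each step, which is why the abstract calls GBMS unstable.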
Linear prediction schemes make a prediction x̂_i of a data sample x_i using the p previous samples. It has been shown [1, 3] that as the order of prediction p → ∞, there is no gain to be obtained by coding subband samples.
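An order-p linear predictor of the kind described above can be fit by least squares over the signal's own history. This sketch is illustrative; the function name, the AR(1) test signal, and the choice of p are assumptions, not from the text.

```python
import numpy as np

def linear_predict(x, p):
    """Fit order-p linear prediction coefficients a by least squares
    so that x[i] ~ a @ x[i-p:i], and return (predictions, a)."""
    X = np.column_stack([x[i:len(x) - p + i] for i in range(p)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ a, a

# Illustrative AR(1) signal: x[i] = 0.9*x[i-1] + noise.
rng = np.random.default_rng(2)
x = np.zeros(1000)
for i in range(1, len(x)):
    x[i] = 0.9 * x[i - 1] + rng.normal()

pred, a = linear_predict(x, p=2)
mse = np.mean((x[2:] - pred) ** 2)  # residual power after prediction
```

For a strongly correlated signal the residual power is well below the signal variance, which is the prediction gain the passage refers to.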
In this paper we introduce a new algorithm, Information Theoretic Mean Shift, to capture the "predominant structure" in the data. We formulate this problem with a cost function that minimizes the entropy of the data subject to the constraint that the Cauchy-Schwarz distance between the new and the original dataset is fixed at some constant value. We show that Gaussian Mean Shift and Gaussian Blurring Mean Shift are special cases of this generalized algorithm, giving a whole new perspective on the idea of mean shift. Further, this algorithm can also be used to capture the principal curve of the data, making it broadly applicable to manifold learning.
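Under the usual Lagrangian relaxation, the constrained formulation described above can be sketched as follows; the symbols Y for the evolving dataset, X for the original data, and λ for the multiplier are notational assumptions, not quoted from the abstract:

```latex
\min_{Y} \; J(Y) \;=\; H_2(Y) \;+\; \lambda \, D_{CS}(Y, X)
```

where H_2 denotes Rényi's quadratic entropy and D_CS the Cauchy-Schwarz distance; varying λ trades off entropy minimization (full blurring, as in GBMS) against fidelity to the original dataset (mode finding, as in GMS).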
Abstract. In this paper, we propose a fast and accurate approximation to the information potential of Information Theoretic Learning (ITL) using the Fast Gauss Transform (FGT). We exemplify here the case of the Minimum Error Entropy criterion to train adaptive systems. The FGT reduces the complexity of the estimation from O(N^2) to O(pkN), where p is the order of the Hermite approximation and k the number of clusters utilized in the FGT. Further, we show that the FGT converges to the actual entropy value rapidly with increasing order p, unlike the Stochastic Information Gradient, the present O(pN) approximation used to reduce the computational complexity in ITL. We test the performance of these FGT methods on system identification, with encouraging results.
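For concreteness, the O(N^2) quantity that the FGT accelerates is the information potential: the average of pairwise Gaussian kernel evaluations, whose negative logarithm is the sample estimate of Rényi's quadratic entropy. This is a naive baseline sketch; the variable names and the test signal are illustrative assumptions.

```python
import numpy as np

def information_potential(x, sigma):
    """Naive O(N^2) information potential for 1-D data:
    V = (1/N^2) * sum_i sum_j G(x_i - x_j), where G is a Gaussian
    of variance 2*sigma^2 (the convolution of two Parzen kernels).
    Renyi's quadratic entropy estimate is then H2 = -log(V)."""
    d = x[:, None] - x[None, :]
    g = np.exp(-d ** 2 / (4 * sigma ** 2)) / np.sqrt(4 * np.pi * sigma ** 2)
    return g.mean()

x = np.random.default_rng(1).normal(size=500)  # illustrative data
V = information_potential(x, sigma=0.5)
H2 = -np.log(V)  # Renyi quadratic entropy estimate
```

The double sum over all pairs is exactly the O(N^2) bottleneck; the FGT replaces it with an order-p Hermite expansion around k cluster centers, giving the O(pkN) cost quoted in the abstract.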
Previous studies have shown that embedding local search in classical evolutionary programming (EP) could lead to improved performance on function optimization problems. In this paper, the utility of local search is investigated with fast evolutionary programming (FEP), and comparisons are offered between the performance improvements obtained when using local search with Gaussian and Cauchy mutations. Experiments were conducted on a suite of four well-known function optimization problems using two local search methods (conjugate gradient and Solis and Wets) with varying amounts of local search incorporated into the evolutionary algorithm. Empirical results indicate that FEP with the conjugate gradient method outperforms other hybrid methods on three of the four functions when evolution was conducted for a fixed number of generations. Trials using local search produced solutions that were statistically as good as or better than trials without local search. However, the cost of using local search justified the enhancement in solution quality only when using Gaussian mutations, not when using Cauchy mutations.
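The Cauchy mutation that distinguishes FEP from classical EP can be sketched as below. The self-adaptation constants follow the standard FEP recipe that this line of work builds on; the function name and the toy usage are illustrative assumptions.

```python
import numpy as np

def fep_mutate(x, eta, rng):
    """FEP-style mutation: self-adapt the per-coordinate step sizes
    eta lognormally, then perturb x with standard Cauchy noise.
    Cauchy's heavy tails produce occasional long jumps, unlike the
    Gaussian mutation of classical EP."""
    n = x.size
    tau = 1.0 / np.sqrt(2.0 * np.sqrt(n))        # per-coordinate rate
    tau_prime = 1.0 / np.sqrt(2.0 * n)           # global rate
    eta_new = eta * np.exp(tau_prime * rng.normal()
                           + tau * rng.normal(size=n))
    child = x + eta_new * rng.standard_cauchy(size=n)
    return child, eta_new

# Illustrative call: mutate a 5-dimensional parent.
rng = np.random.default_rng(0)
child, eta_new = fep_mutate(np.zeros(5), np.ones(5), rng)
```

A hybrid of the kind studied in the abstract would then apply a local search method (e.g. conjugate gradient) to refine selected offspring before the next generation.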