Abstract. The process of representing a large data set with a smaller number of vectors in the best possible way, also known as vector quantization, has been intensively studied in the recent years. Very efficient algorithms like the Kohonen self-organizing map (SOM) and the Linde Buzo Gray (LBG) algorithm have been devised. In this paper a physical approach to the problem is taken, and it is shown that by considering the processing elements as points moving in a potential field an algorithm equally efficient as the before mentioned can be derived. Unlike SOM and LBG this algorithm has a clear physical interpretation and relies on minimization of a well defined cost function. It is also shown how the potential field approach can be linked to information theory by use of the Parzen density estimator. In the light of information theory it becomes clear that minimizing the free energy of the system is in fact equivalent to minimizing a divergence measure between the distribution of the data and the distribution of the processing elements, hence, the algorithm can be seen as a density matching method.
Principal components analysis is an important and well-studied subject in statistics and signal processing. The literature has an abundance of algorithms for solving this problem, where most of these algorithms could be grouped into one of the following three approaches: adaptation based on Hebbian updates and deflation, optimization of a second-order statistical criterion (like reconstruction error or output variance), and fixed point update rules with deflation. In this paper, we take a completely different approach that avoids deflation and the optimization of a cost function using gradients. The proposed method updates the eigenvector and eigenvalue matrices simultaneously with every new sample such that the estimates approximately track their true values as would be calculated from the current sample estimate of the data covariance matrix. The performance of this algorithm is compared with that of traditional methods like Sanger's rule and APEX, as well as a structurally similar matrix perturbation-based method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.