Information divergences are integral functionals of two probability distributions and have many applications in information theory, statistics, signal processing, and machine learning. Applications of divergences include estimating the decay rates of error probabilities [1], estimating bounds on the Bayes error [2,3,4,5,6,7,8] or the minimax error [9] for a classification problem, extending machine learning algorithms to distributional features [10,11,12,13], testing the hypothesis that two sets of samples come from the same probability distribution [14], clustering [15,16,17], feature selection and classification [18,19,20], blind source separation [21,22], image segmentation [23,24,25], and steganography [26]. For many more applications of divergence measures, see [27].
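As a concrete illustration (not drawn from the cited works), the Kullback–Leibler divergence, one of the most common such integral functionals, can be sketched for two discrete probability distributions as follows; the function name and convention for handling zero-probability entries are choices made here for clarity:

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), a canonical example
    of an information divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute 0 by the usual convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Identical distributions yield zero divergence; differing ones yield a
# positive value, which is what makes divergences useful as dissimilarity
# measures in the applications listed above.
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))
print(kl_divergence([0.9, 0.1], [0.5, 0.5]))
```

Note that the KL divergence is asymmetric in its arguments, which is one reason the literature studies many different divergence families rather than a single canonical one.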