“…In particular, the family of Bregman divergences, which includes the widely used Kullback-Leibler divergence, can offer numerous advantages in learning applications compared to the Euclidean distance alone [34]. Notably, in the case of deterministic annealing, Bregman divergences play an even more important role, since we can show that, if d is a Bregman divergence, the solution to the second optimization step (14) can be analytically computed in a convenient centroid form [30]:…”
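The centroid property mentioned above can be checked numerically: for any Bregman divergence, the weighted mean of the data minimizes the expected divergence from the data to a single representative. Below is a minimal sketch using the generalized Kullback-Leibler divergence on scalars; the data points and weights are hypothetical, chosen only for illustration, and the optimization step labeled (14) in the source is not reproduced here.

```python
import math

def gen_kl(x, y):
    # Generalized KL divergence d(x, y) = x*log(x/y) - x + y,
    # the Bregman divergence generated by phi(x) = x*log(x).
    return x * math.log(x / y) - x + y

# Hypothetical data points and association weights (assumptions).
points = [0.5, 1.5, 3.0]
weights = [0.2, 0.5, 0.3]  # nonnegative, summing to 1

def objective(mu):
    # Expected Bregman divergence from the data to candidate centroid mu.
    return sum(p * gen_kl(x, mu) for p, x in zip(weights, points))

# The claimed analytic solution: the weighted arithmetic mean.
centroid = sum(p * x for p, x in zip(weights, points))

# Sanity check: perturbing mu away from the centroid increases the objective.
for mu in (centroid * 0.9, centroid * 1.1, centroid + 0.5):
    assert objective(centroid) < objective(mu)

print(centroid)
```

The same check passes for any other Bregman divergence (e.g. squared Euclidean distance), since the minimizer of the expected divergence in the second argument's role here is always the weighted mean, independent of the generating function.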