“…Recently, the interest on the Wasserstein distance [36] and the associated Optimal Transport theory [37] have been growing due to their successful application in many domains (e.g., imaging, signal processing and analysis, natural language process/generation, human learning, optimization, etc.) [38,39,40,41,42,43,44]. Very briefly, the Wasserstein distance allows to measure the difference between two probability distributions, α and α , independently on the values and the nature of their supports (they can be both discrete, continuous, or one continuous and one discrete).…”