Data which lie in the space $P_m$ of $m \times m$ symmetric positive definite matrices (sometimes called tensor data) play a fundamental role in applications including medical imaging, computer vision, and radar signal processing. An open challenge for these applications is to find a class of probability distributions able to capture the statistical properties of data in $P_m$ as they arise in real-world situations. The present paper meets this challenge by introducing Riemannian Gaussian distributions on $P_m$. Distributions of this kind were first considered by Pennec in 2006. However, the present paper gives an exact expression of their probability density function for the first time in the existing literature. This leads to two original contributions. The first is a detailed study of statistical inference for Riemannian Gaussian distributions, uncovering the connection between maximum likelihood estimation and the concept of Riemannian centre of mass, which is widely used in applications. The second is the derivation and implementation of an expectation-maximisation algorithm for the estimation of mixtures of Riemannian Gaussian distributions. The paper applies this new algorithm to the classification of data in $P_m$ (concretely, to the problem of texture classification in computer vision), showing that it yields significantly better performance in comparison to recent approaches.
Index Terms: Symmetric positive definite matrices, tensor, Riemannian metric, Gaussian distribution, expectation-maximisation, texture

where, again, $d : P_m \times P_m \to \mathbb{R}_+$ is Rao's Riemannian distance. Since $\hat{Y}_N$ minimises the sum of squares of distances to the points $Y_1, \ldots, Y_N$, it is widely viewed as a representative, average, or mode of these points.

Distributions of the form (1) were considered by Pennec, who defined them on general Riemannian manifolds [20]. However, in the existing literature, their treatment remains incomplete, as it is based on asymptotic formulae that are valid only in the limit where the parameter $\sigma$ is small; see [20]-[22]. In addition to being inexact, such formulae are quite difficult both to evaluate and to apply. These issues (lack of an exact expression and difficulty of application) are overcome in the following.
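To make the notion of Riemannian centre of mass concrete, the following is a minimal numerical sketch, not the paper's implementation. It assumes that Rao's Riemannian distance takes its usual affine-invariant form, $d(Y, Z) = \| \log(Y^{-1/2} Z Y^{-1/2}) \|_F$, and it uses a standard fixed-point (unit-step gradient descent) iteration to approximate the minimiser of the sum of squared distances. The function names `rao_distance` and `centre_of_mass`, the starting point, and the stopping rule are illustrative choices.

```python
import numpy as np


def _spd_fun(Y, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(Y)
    return (V * fun(w)) @ V.T


def _whiten(Y_isqrt, Z):
    """Return Y^{-1/2} Z Y^{-1/2}, symmetrised against round-off."""
    M = Y_isqrt @ Z @ Y_isqrt
    return 0.5 * (M + M.T)


def rao_distance(Y, Z):
    """Assumed affine-invariant form: Frobenius norm of log(Y^{-1/2} Z Y^{-1/2})."""
    Y_isqrt = _spd_fun(Y, lambda w: 1.0 / np.sqrt(w))
    w = np.linalg.eigvalsh(_whiten(Y_isqrt, Z))
    return np.sqrt(np.sum(np.log(w) ** 2))


def centre_of_mass(samples, n_iter=50, tol=1e-10):
    """Approximate the minimiser of sum_n d^2(Y, Y_n) by a fixed-point iteration."""
    Y = np.mean(samples, axis=0)  # arithmetic mean as an (illustrative) starting point
    for _ in range(n_iter):
        Y_sqrt = _spd_fun(Y, np.sqrt)
        Y_isqrt = _spd_fun(Y, lambda w: 1.0 / np.sqrt(w))
        # Average of the samples mapped to the tangent space at the current Y
        T = np.mean([_spd_fun(_whiten(Y_isqrt, S), np.log) for S in samples], axis=0)
        # Move along the geodesic from Y in the direction of that average
        Y = Y_sqrt @ _spd_fun(T, np.exp) @ Y_sqrt
        if np.linalg.norm(T) < tol:
            break
    return Y


if __name__ == "__main__":
    pts = [np.diag([1.0, 4.0]), np.diag([4.0, 1.0])]
    Y_hat = centre_of_mass(pts)
    print(Y_hat)
    print([rao_distance(Y_hat, P) for P in pts])
```

For commuting (here, diagonal) matrices the centre of mass under this distance reduces to the entrywise geometric mean, which gives a simple sanity check on the iteration.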