Following the seminal idea of Tukey (1975), data depth is a function that measures how close an arbitrary point of the space lies to an implicitly defined center of a data cloud. Having undergone substantial theoretical and computational development, it is now employed in numerous applications, with classification being the most popular one. The R package ddalpha is software designed to fuse the experience of the practitioner with recent achievements in the area of data depth and depth-based classification.

ddalpha provides an implementation for exact and approximate computation of the most reasonable and widely applied notions of data depth. These can further be used in the depth-based multivariate and functional classifiers implemented in the package, where the DDα-procedure is the main focus. The package can be extended with user-defined custom depth methods and separators. The implemented functions for depth visualization and the built-in benchmark procedures may also serve to provide insights into the geometry of the data and the quality of pattern recognition.

Being intrinsically nonparametric, a depth function captures the geometrical features of given data in an affine-invariant way. It thereby proves useful for describing a data set's location, scatter, and shape, allowing for multivariate inference, outlier detection, ordering of multivariate distributions, and in particular classification, which has recently become an important and rapidly developing application of the depth machinery. While the parameter-free nature of data depth ensures attractive theoretical properties of classifiers, its ability to reflect data topology yields promising predictive performance on finite samples.
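To make the notion concrete, the following sketch computes one of the simplest depth functions, the Mahalanobis depth, for a point with respect to a data cloud. This is only an illustration of the general concept, not the implementation used in ddalpha; the function name and setup are our own.

```python
import numpy as np

def mahalanobis_depth(x, data):
    """Mahalanobis depth of point x w.r.t. the cloud `data`:
    D(x) = 1 / (1 + (x - mu)' S^{-1} (x - mu)),
    with mu, S the sample mean and covariance.
    Values lie in (0, 1]; points closer to the center score higher."""
    mu = data.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(data, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 2))
center_depth = mahalanobis_depth(cloud.mean(axis=0), cloud)  # exactly 1 at the mean
far_depth = mahalanobis_depth(np.array([5.0, 5.0]), cloud)   # small for an outlying point
```

Affine invariance holds here because the quadratic form is computed in the metric of the sample covariance, so linear transformations of the data leave the depth values unchanged.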
Classification in the depth space

Consider the following setting for supervised classification: a training sample consists of q classes X_1, ..., X_q, each containing n_i, i = 1, ..., q, observations in R^d. For a new observation x_0, the class to which it most probably belongs should be determined. Depth-based learning started with plug-in type classifiers. Ghosh and Chaudhuri (2005b) construct a depth-based classifier which, in its naïve form, assigns the observation x_0 to the class in which it has maximal depth. They suggest an extension of the classifier that is consistent w.r.t. the Bayes risk for classes stemming from elliptically symmetric distributions. Further, Dutta and Ghosh (2011, 2012) suggest a robust classifier and a classifier for L_p-symmetric distributions; see also Cui et al. (2008), Mosler and Hoberg (2006), and additionally Jörnsten (2004) for unsupervised classification.

A novel way to perform depth-based classification has been suggested by Li et al. (2012): first map a pair of training classes into a two-dimensional depth space, called the DD-plot, and then perform classification by selecting a polynomial that minimizes the empirical risk. Finding such an optimal polynomial numerically is a very challenging and, when done appropriately, computationally involved task, with a solution that in practice ca...
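The two ideas above can be sketched in a few lines: the naïve max-depth rule assigns x_0 to the class in which it is deepest, and the DD-plot maps each point to its pair of depths in the two training classes. Mahalanobis depth stands in here for the many depth notions ddalpha offers; all names and the toy data are our own assumptions for illustration.

```python
import numpy as np

def mahalanobis_depth(x, data):
    # Illustrative depth: 1 / (1 + squared Mahalanobis distance to the mean).
    mu = data.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(data, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

def max_depth_classify(x0, classes):
    """Naive max-depth rule: assign x0 to the class where it is deepest."""
    depths = [mahalanobis_depth(x0, X) for X in classes]
    return int(np.argmax(depths))

def dd_plot_coords(points, class_a, class_b):
    """Map each point into the DD-plot: (depth w.r.t. class_a, depth w.r.t. class_b).
    A separator (e.g., Li et al.'s polynomial) is then chosen in this 2-D space."""
    return np.array([[mahalanobis_depth(p, class_a),
                      mahalanobis_depth(p, class_b)] for p in points])

rng = np.random.default_rng(1)
X1 = rng.normal(loc=0.0, size=(150, 2))   # class 0, centered at the origin
X2 = rng.normal(loc=3.0, size=(150, 2))   # class 1, centered at (3, 3)
label_near_X1 = max_depth_classify(np.array([0.1, -0.2]), [X1, X2])  # → 0
label_near_X2 = max_depth_classify(np.array([3.2, 2.9]), [X1, X2])   # → 1
dd = dd_plot_coords(np.vstack([X1, X2]), X1, X2)  # 300 points in the depth space
```

Whatever the dimension d of the original data, the DD-plot is always two-dimensional for a pair of classes, which is what makes selecting a separator there tractable.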