Following the seminal idea of Tukey (1975), data depth is a function that measures how close an arbitrary point of the space lies to an implicitly defined center of a data cloud. Having undergone substantial theoretical and computational development, it is now employed in numerous applications, with classification being the most popular one. The R package ddalpha is software designed to fuse the experience of the practitioner with recent achievements in the area of data depth and depth-based classification.

ddalpha provides an implementation for exact and approximate computation of the most reasonable and widely applied notions of data depth. These can further be used in the depth-based multivariate and functional classifiers implemented in the package, where the DDα-procedure is the main focus. The package can be extended with user-defined custom depth methods and separators. The implemented functions for depth visualization and the built-in benchmark procedures may also serve to provide insights into the geometry of the data and the quality of pattern recognition.

Being intrinsically nonparametric, a depth function captures the geometrical features of given data in an affine-invariant way. It thereby proves useful for describing a data set's location, scatter, and shape, allowing for multivariate inference, outlier detection, ordering of multivariate distributions, and in particular classification, which has recently become an important and rapidly developing application of the depth machinery. While the parameter-free nature of data depth ensures attractive theoretical properties of classifiers, its ability to reflect data topology yields promising predictive performance on finite samples.
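To make the notion concrete, the following sketch computes one of the simplest depth functions, the Mahalanobis depth, for a point with respect to a data cloud. This is only an illustration of the general concept, not the implementation used in ddalpha; the function name and setup are our own.

```python
import numpy as np

def mahalanobis_depth(x, data):
    """Mahalanobis depth of point x w.r.t. the cloud `data`:
    D(x) = 1 / (1 + (x - mu)' S^{-1} (x - mu)),
    with mu, S the sample mean and covariance.
    Values lie in (0, 1]; points closer to the center score higher."""
    mu = data.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(data, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 2))
center_depth = mahalanobis_depth(cloud.mean(axis=0), cloud)  # exactly 1 at the mean
far_depth = mahalanobis_depth(np.array([5.0, 5.0]), cloud)   # small for an outlying point
```

Affine invariance holds here because the quadratic form is computed in the metric of the sample covariance, so linear transformations of the data leave the depth values unchanged.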
Classification in the depth space

Consider the following setting for supervised classification: a training sample consists of q classes X_1, ..., X_q, each containing n_i, i = 1, ..., q, observations in R^d. For a new observation x_0, the class to which it most probably belongs should be determined. Depth-based learning started with plug-in type classifiers. Ghosh and Chaudhuri (2005b) construct a depth-based classifier which, in its naïve form, assigns the observation x_0 to the class in which it has maximal depth. They suggest an extension of the classifier that is consistent w.r.t. the Bayes risk for classes stemming from elliptically symmetric distributions. Further, Dutta and Ghosh (2011, 2012) suggest a robust classifier and a classifier for L_p-symmetric distributions; see also Cui et al. (2008), Mosler and Hoberg (2006), and additionally Jörnsten (2004) for unsupervised classification.

A novel way to perform depth-based classification has been suggested by Li et al. (2012): first map a pair of training classes into a two-dimensional depth space, called the DD-plot, and then perform classification by selecting a polynomial that minimizes the empirical risk. Finding such an optimal polynomial numerically is a very challenging and, when done appropriately, computationally involved task, with a solution that in practice ca...
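The two ideas above can be sketched in a few lines: the naïve max-depth rule assigns x_0 to the class in which it is deepest, and the DD-plot maps each point to its pair of depths in the two training classes. Mahalanobis depth stands in here for the many depth notions ddalpha offers; all names and the toy data are our own assumptions for illustration.

```python
import numpy as np

def mahalanobis_depth(x, data):
    # Illustrative depth: 1 / (1 + squared Mahalanobis distance to the mean).
    mu = data.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(data, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

def max_depth_classify(x0, classes):
    """Naive max-depth rule: assign x0 to the class where it is deepest."""
    depths = [mahalanobis_depth(x0, X) for X in classes]
    return int(np.argmax(depths))

def dd_plot_coords(points, class_a, class_b):
    """Map each point into the DD-plot: (depth w.r.t. class_a, depth w.r.t. class_b).
    A separator (e.g., Li et al.'s polynomial) is then chosen in this 2-D space."""
    return np.array([[mahalanobis_depth(p, class_a),
                      mahalanobis_depth(p, class_b)] for p in points])

rng = np.random.default_rng(1)
X1 = rng.normal(loc=0.0, size=(150, 2))   # class 0, centered at the origin
X2 = rng.normal(loc=3.0, size=(150, 2))   # class 1, centered at (3, 3)
label_near_X1 = max_depth_classify(np.array([0.1, -0.2]), [X1, X2])  # → 0
label_near_X2 = max_depth_classify(np.array([3.2, 2.9]), [X1, X2])   # → 1
dd = dd_plot_coords(np.vstack([X1, X2]), X1, X2)  # 300 points in the depth space
```

Whatever the dimension d of the original data, the DD-plot is always two-dimensional for a pair of classes, which is what makes selecting a separator there tractable.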