Manifold-adaptive dimension estimation

Farahmand, Amir-massoud; Szepesvári, Csaba; Audibert, Jean-Yves

doi:10.1145/1273496.1273530

Cited by 83 publications

(73 citation statements)

References 13 publications

(13 reference statements)


“…It has been attempted to localize PCA to small neighborhoods [39,40,41,42], without much success [43], at least compared to what we may call volume-based methods [44,45,46,47,48,12,13,49,50,51,52,53,54], which we discuss at length in Section 7. These methods, roughly speaking, are based on empirical estimates of the volume of M ∩ B_z(r), for z ∈ M and r > 0: such volume grows like r^k when M has dimension k, and k is estimated by fitting the empirical volume estimates for different values of r. We expect such methods, at least when naively implemented, to both require a number of samples exponential in k (if O(1) samples exist in M ∩ B_z(r_0), for some r_0 > 0, these algorithms require O(2^k) points in M ∩ B_z(2r_0)), and to be highly sensitive to noise, which affects the density in high dimensions.…”

Section: Manifolds, Local PCA and Intrinsic Dimension Estimation (mentioning)

confidence: 99%
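The volume-growth idea in the statement above can be illustrated with a minimal sketch. It is not the estimator of any cited paper: it assumes roughly uniform sampling near the center, compares the sample counts in two concentric balls of radii r and 2r, and reads the dimension off the ratio, since N(2r)/N(r) is about 2^k. The function name `volume_dimension_estimate` and the synthetic data are hypothetical.

```python
import math
import random

def volume_dimension_estimate(points, center, r):
    """Estimate k from the growth of ball counts: N(2r) / N(r) is about 2^k."""
    def count_within(radius):
        # Empirical "volume": number of samples in the ball of this radius.
        return sum(1 for p in points if math.dist(p, center) <= radius)

    n_small = count_within(r)
    n_large = count_within(2 * r)
    if n_small == 0:
        raise ValueError("no samples inside the smaller ball; increase r")
    return math.log2(n_large / n_small)

# Illustrative data only: a flat 2-dimensional patch embedded in R^5.
rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1), 0.0, 0.0, 0.0)
          for _ in range(20000)]
estimate = volume_dimension_estimate(points, center=(0.0,) * 5, r=0.4)
print(round(estimate, 2))
```

The sketch also makes the sample-size concern in the quote concrete: keeping a fixed number of samples in the inner ball requires about 2^k times as many in the outer one.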

“…We find an average intrinsic dimension k = 2 (Figure 13). [67] finds k between 3 and 4 (smaller values at large scales), [68] find k ∈ [3.65, 4.65], [51] find an intrinsic dimension k = 3 using either Takens, Grassberger Procaccia or the Smoothed Grassberger Procaccia estimators, [69] find k = 4 and k = 3 depending on the way the point-wise estimates are combined (average and voting, respectively), and finally [44] find k = 4.3. Finally, we consider some data-sets whose intrinsic dimension has not been previously analyzed.…”

Section: Real Data Sets (mentioning)

confidence: 99%


Multiscale geometric methods for data sets I: Multiscale SVD, noise and curvature

Little

Maggioni

Rosasco

2017

Applied and Computational Harmonic Analysis


Large data sets are often modeled as being noisy samples from probability distributions µ in R^D, with D large. It has been noticed that oftentimes the support M of these probability distributions seems to be well-approximated by low-dimensional sets, perhaps even by manifolds. We shall consider sets that are locally well approximated by k-dimensional planes, with k ≪ D, with k-dimensional manifolds isometrically embedded in R^D being a special case. Samples from µ are furthermore corrupted by D-dimensional noise. Certain tools from multiscale geometric measure theory and harmonic analysis seem well-suited to be adapted to the study of samples from such probability distributions, in order to yield quantitative geometric information about them. In this paper we introduce and study multiscale covariance matrices, i.e. covariances corresponding to the distribution restricted to a ball of radius r, with a fixed center and varying r, and under rather general geometric assumptions we study how their empirical, noisy counterparts behave. We prove that in the range of scales where these covariance matrices are most informative, the empirical, noisy covariances are close to their expected, noiseless counterparts. In fact, this is true as soon as the number of samples in the balls where the covariance matrices are computed is linear in the intrinsic dimension of M. As an application, we present an algorithm for estimating the intrinsic dimension of M.
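A rough sketch of the empirical multiscale covariance described in this abstract may help: for a fixed center z and several radii r, take the samples inside the ball of radius r and compute the eigenvalues of their covariance. This is only an illustration under assumed synthetic data (a noisy 2-dimensional plane in R^5), not the authors' algorithm; the function name `multiscale_covariance_spectra` and all parameter values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def multiscale_covariance_spectra(X, z, radii):
    """For each radius r, return the eigenvalues (descending) of the covariance
    of the rows of X that fall inside the ball of radius r centered at z."""
    dists = np.linalg.norm(X - z, axis=1)
    spectra = {}
    for r in radii:
        local = X[dists <= r]
        if len(local) < 2:
            continue  # too few samples to form a covariance at this scale
        cov = np.cov(local, rowvar=False)            # rows are observations
        spectra[r] = np.linalg.eigvalsh(cov)[::-1]   # eigvalsh sorts ascending
    return spectra

# Illustrative data only: a 2-dimensional plane in R^5 plus small isotropic noise.
rng = np.random.default_rng(0)
n, D, sigma = 4000, 5, 0.01
X = np.zeros((n, D))
X[:, :2] = rng.uniform(-2.0, 2.0, size=(n, 2))
X += sigma * rng.standard_normal((n, D))

spectra = multiscale_covariance_spectra(X, z=np.zeros(D), radii=[0.5, 1.0, 1.5])
for r, eigs in spectra.items():
    print(r, np.round(eigs, 4))
```

On data like this, the two leading eigenvalues grow with r while the remaining ones stay near the noise level, so the gap between them suggests the intrinsic dimension.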


“…where D_m can be estimated in advance from the dataset by other algorithms [52][53][54][55][56][57]. Based on Equation (25), we argue that for effectiveness of NCSC or even other classifiers, the number of training samples of a class should not be less than the intrinsic manifold dimension D_m.…”

Section: Intrinsic Dimension (mentioning)

confidence: 99%

“…These estimators can be broadly divided into two categories: eigen projection methods [58,59] and geometric methods [52][53][54][55][56][57]. Eigen projection methods estimate intrinsic dimension from the eigen decomposition of the covariance matrix of the given data.…”

Section: Intrinsic Dimension Estimator (mentioning)

confidence: 99%

“…Their estimates are given as the number of eigenvalues not less than a predefined threshold. Geometric methods, including Corr.Dim (Correlation Dimension) [53,54], MLE (Maximum Likelihood Estimate) [52] and their variations [55][56][57], exploit the intrinsic geometry of the dataset and are more sophisticated than their eigen projection counterparts [52].…”

Section: Intrinsic Dimension Estimator (mentioning)

confidence: 99%
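The eigen projection rule quoted above (count the covariance eigenvalues that are not less than a predefined threshold) can be sketched in a few lines. This is a generic illustration, not code from the cited paper; the function name `eigen_threshold_dimension`, the threshold value, and the synthetic data are assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def eigen_threshold_dimension(X, threshold):
    """Count the eigenvalues of the sample covariance of X that are >= threshold."""
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))  # rows are observations
    return int(np.sum(eigvals >= threshold))

# Illustrative data only: 3 informative coordinates in R^10 plus small noise.
rng = np.random.default_rng(1)
X = np.zeros((500, 10))
X[:, :3] = rng.standard_normal((500, 3))
X += 0.01 * rng.standard_normal((500, 10))

k_hat = eigen_threshold_dimension(X, threshold=0.05)
print(k_hat)
```

The result depends on the chosen threshold, which has to sit between the noise level and the smallest informative eigenvalue.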


Image recognition via two-dimensional random projection and nearest constrained subspace

Liao

Zhang

Maybank

et al. 2014

Journal of Visual Communication and Image Representation


We consider the problem of image recognition via two-dimensional random projection and nearest constrained subspace. First, image features are extracted by a two-dimensional random projection. The two-dimensional random projection for feature extraction is an extension of the 1D compressive sampling technique to 2D; it is computationally more efficient than its 1D counterpart, and 2D reconstruction is guaranteed. Second, we design a new classifier called NCSC (Nearest Constrained Subspace Classifier) and apply it to image recognition with the 2D features. The proposed classifier is a generalized version of NN (Nearest Neighbor) and NFL (Nearest Feature Line), and it has a close relationship to NS (Nearest Subspace). For large datasets, a fast NCSC, called NCSC-II, is proposed. Experiments on several publicly available image sets show that, when well-tuned, NCSC/NCSC-II outperforms its rivals including NN, NFL, NS and the orthonormal ℓ2-norm classifier. NCSC/NCSC-II with the 2D random features also shows good classification performance in noisy environments.


Locality preserving based data regression and its application for soft sensor modelling

Miao

2016

Can J Chem Eng


A new local‐based data regression technique named locality preserving regression (LPR) is developed and applied to soft sensor modelling in the present study. By taking into consideration the local variation obtained by locality preserving projections, the LPR regression algorithm is used to construct a soft sensor model, which is applied to an industrial case. Furthermore, to deal with the time‐varying behaviour of the process variables, just‐in‐time learning is also integrated to regularly update the soft sensor. Two case studies, on a fermentation process for penicillin concentration prediction and on the Tennessee Eastman process for output component prediction, are provided to demonstrate the performance of the proposed method. Finally, the effectiveness and robustness of the proposed local‐based technique for soft sensor modelling are assessed and compared with global‐based soft sensors using the mean square error and the coefficient of determination. Experimental results showed that the novel soft sensor model could estimate the output with higher accuracy and generalization ability than the general soft sensor based on global information.
