Vittorio Erba scite author profile

Identifying the minimal number of parameters needed to describe a dataset is a challenging problem known in the literature as intrinsic dimension estimation. Existing intrinsic dimension estimators become unreliable whenever the dataset is locally undersampled, and this is at the core of the so-called curse of dimensionality. Here we introduce a new intrinsic dimension estimator that leverages simple properties of the tangent space of a manifold and extends the usual correlation integral estimator to alleviate the extreme undersampling problem. Based on this insight, we explore a multiscale generalization of the algorithm that is capable of (i) identifying multiple dimensionalities in a dataset, and (ii) providing accurate estimates of the intrinsic dimension of extremely curved manifolds. We test the method on manifolds generated from global transformations of high-contrast images, which are relevant for invariant object recognition and considered a challenge for state-of-the-art intrinsic dimension estimators.
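For orientation, the "usual correlation integral estimator" that the abstract takes as its starting point can be sketched as follows. This is only the standard baseline (the slope of log C(r) against log r), not the new estimator introduced in the paper. The function names, the two radii and the test dataset (a flat unit square placed in a 10-dimensional space) are assumptions chosen for illustration.

```python
import math
import random


def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(points, r1, r2):
    """Two-point slope of log C(r) versus log r between radii r1 < r2."""
    c1 = correlation_integral(points, r1)
    c2 = correlation_integral(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))


def sample_embedded_square(n, ambient_dim, rng):
    """Uniform points on a 2D unit square; the extra coordinates are zero."""
    return [
        [rng.random(), rng.random()] + [0.0] * (ambient_dim - 2)
        for _ in range(n)
    ]


rng = random.Random(0)
points = sample_embedded_square(600, 10, rng)  # illustrative dataset
estimate = correlation_dimension(points, 0.05, 0.2)  # illustrative radii
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The estimate should come out close to 2, the dimension of the square, rather than 10, the dimension of the ambient space. With few points per neighbourhood, C(r) at small r is computed from very few pairs, which is the undersampling regime the abstract refers to.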


The Dyck bound in the concave 1-dimensional random assignment model

Caracciolo, D'Achille, Erba et al. 2020

J. Phys. A: Math. Theor.


We consider models of assignment for random $N$ blue points and $N$ red points on an interval of length $2N$, in which the cost for connecting a blue point in $x$ to a red point in $y$ is the concave function $|x - y|^p$, for $0 < p < 1$. Contrary to the convex case $p > 1$, where the optimal matching is trivially determined, here the optimization is non-trivial. The purpose of this paper is to introduce a special configuration, which we call the Dyck matching, and to study its statistical properties. We compute exactly the average cost, in the asymptotic limit of large $N$, together with the first subleading correction. The scaling is remarkable: it is of order $N$ for $p < 1/2$, order $N \ln N$ for $p = 1/2$, and $N$ ...

arXiv:1904.10867v2 [cond-mat.dis-nn]

In this paper we study the statistical properties of the Euclidean random assignment problem, in the case in which the points are confined to a one-dimensional interval, and the cost is a concave increasing function of their distance. The assignment problem is a combinatorial optimization problem, a special case of the matching problem when the underlying graph is bipartite. As for any combinatorial optimization problem, each realisation of the problem is described by an instance $J$, and the goal is to find, within some space of configurations, the particular one that minimises the given cost function. In the assignment problem, the instance $J$ is a real-positive $N \times N$ matrix, encoding the costs of each possible pairing among $N$ blue and $N$ red points ($J_{ij}$ is the cost of pairing the $i$-th blue point with the $j$-th red point), the space of configurations is the set of permutations of $N$ objects, $\pi \in S_N$ (describing a complete assignment of blue and red points), and the cost function is

$$H_J(\pi) = \sum_{i=1}^{N} J_{i\,\pi(i)}.$$

A random assignment problem is the datum of a probability measure $\mu_N(J)$ on the possible instances of the problem. The interest is in the determination of the statistical properties of this problem w.r.t. the measure under analysis, and in particular the statistical properties of the optimal configuration. The problem can be formulated as the zero-temperature limit of the statistical mechanics properties of a disordered system, where the disorder is the instance $J$, the dynamical variables are encoded by $\pi$, and the Hamiltonian is the cost function $H_J(\pi)$.

The case of the random assignment problem in which the entries $J_{ij}$ are random i.i.d. variables, presented already in [1], was first solved in a seminal paper by Parisi and Mézard [2] through the replica trick, and afterwards through the cavity equations [3] (see also [4] for a recent generalization of those results at finite system size). The Parisi-Mézard solution also leads to the striking prediction that, calling $\pi_{\mathrm{opt}}(J)$ the optimal matching for the instance $J$ and $H_{\mathrm{opt}}(J) = H_J(\pi_{\mathrm{opt}}(J))$ its optimal cost, the average over all instances of $H_{\mathrm{opt}}$, for $N$ large, tends to $\pi^2/6$. This problem is simpler than a spin glass, as the determination of the optimal configuration is feasible in polynomial time (for example, through the celebrated Hungarian Alg...
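The setup described in this excerpt can be made concrete with a small sketch: an instance is a cost matrix built from blue and red points on an interval of length 2N, a configuration is a permutation, and the cost is the sum of the selected matrix entries. The code below is an illustration under stated assumptions, not the paper's method. The points are drawn uniformly at random (the excerpt does not specify the distribution), the function names are hypothetical, and the minimum is found by brute force over all permutations rather than by the polynomial-time algorithm mentioned in the text, so it is only usable for very small N.

```python
import itertools
import random


def cost_matrix(blue, red, p):
    """J[i][j] = |x_i - y_j|**p for blue point x_i and red point y_j."""
    return [[abs(x - y) ** p for y in red] for x in blue]


def assignment_cost(J, perm):
    """Cost of pairing blue point i with red point perm[i], summed over i."""
    return sum(J[i][perm[i]] for i in range(len(perm)))


def optimal_assignment(J):
    """Brute-force minimiser over all N! permutations (small N only)."""
    n = len(J)
    return min(
        itertools.permutations(range(n)),
        key=lambda perm: assignment_cost(J, perm),
    )


rng = random.Random(1)
N = 6
p = 0.5  # concave regime, 0 < p < 1

# Assumption: points drawn uniformly on [0, 2N], then sorted left to right.
blue = sorted(rng.uniform(0, 2 * N) for _ in range(N))
red = sorted(rng.uniform(0, 2 * N) for _ in range(N))

J = cost_matrix(blue, red, p)
best = optimal_assignment(J)
ordered = tuple(range(N))  # i-th blue point paired with i-th red point

print("optimal permutation:", best)
print("optimal cost:", assignment_cost(J, best))
print("ordered-matching cost:", assignment_cost(J, ordered))
```

Rerunning with `p = 2.0` gives an optimal cost equal to that of the ordered matching, which is the sense in which the convex case is trivially determined. For concave `p` the optimal cost can be strictly lower than the ordered one.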


Random geometric graphs in high dimension

2020



Vittorio Erba

Statistical learning theory of structured data

Intrinsic dimension estimation for locally undersampled data

The Dyck bound in the concave 1-dimensional random assignment model

Random geometric graphs in high dimension
