SUMMARY
A new theoretical point of view on density estimation is discussed. The multivariate true density, viewed as a prior or penalizing factor in a Bayesian framework, is modelled by a Gibbs potential, and estimating the density amounts to maximizing the posterior. For computational efficiency, we are interested in an approximate estimator f̂ = Bπ of the true density f, where B is a stochastic operator and π is the raw histogram. We then investigate the discrimination problem, introducing an adaptive bandwidth that depends on the k nearest neighbours and is chosen to optimize a cross-validation criterion. Our final classification algorithm, referred to as APML (approximate penalized maximum likelihood), compares favourably in error rate and running time with the other algorithms tested, including multinormal, nearest-neighbour and convex-hull classifiers.
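The abstract does not spell out the APML estimator itself (the Gibbs prior and the operator B are defined in the paper), but the discrimination step it describes — a per-sample bandwidth set by the distance to the k-th nearest neighbour, with k picked by cross-validation, and classification by the largest estimated class density — can be sketched as follows. All function names here are hypothetical, and the Gaussian variable-kernel estimator is a standard stand-in, not the paper's f̂ = Bπ.

```python
import numpy as np

def knn_bandwidths(train, k):
    """Per-sample bandwidth: distance from each training point to its
    k-th nearest neighbour (column 0 of the sorted distance matrix is
    the point itself, at distance 0)."""
    d = np.linalg.norm(train[:, None] - train[None], axis=2)
    return np.sort(d, axis=1)[:, k]

def adaptive_kde(x, train, h):
    """Gaussian kernel density estimate at x with one bandwidth per
    training sample (a variable-kernel estimator)."""
    d = np.linalg.norm(train - x, axis=1)
    dim = train.shape[1]
    return np.mean(np.exp(-0.5 * (d / h) ** 2) / (np.sqrt(2 * np.pi) * h) ** dim)

def loo_cv_score(train, k):
    """Leave-one-out log-likelihood: the cross-validation criterion one
    would maximize over k.  (Bandwidths of the remaining points are kept
    fixed when a point is held out -- a simplification.)"""
    h = knn_bandwidths(train, k)
    idx = np.arange(len(train))
    return sum(np.log(adaptive_kde(train[i], train[idx != i], h[idx != i]) + 1e-300)
               for i in idx)

def classify(x, classes, k):
    """Assign x to the class whose adaptive density estimate is largest."""
    dens = [adaptive_kde(x, c, knn_bandwidths(c, k)) for c in classes]
    return int(np.argmax(dens))
```

In use, one would evaluate `loo_cv_score` on a grid of k values per class, keep the maximizer, and then call `classify` with it.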
A new unbiased, consistent, asymptotically normal estimator U_k of the intensity λ of a stationary multivariate Poisson point process is exhibited. The estimator is based on a combination of the j-th nearest-neighbor (possibly non-Euclidean) distances (j = 1, ..., k) to a single fixed site x. A simple closed form containing logarithmic terms is obtained for E(U_k^l) (0
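The precise combination of distances defining U_k is not given in this excerpt. A sketch of the idea with the simplest member of this family: for a planar homogeneous Poisson process, λπR_k² follows a Gamma(k, 1) law, where R_k is the k-th nearest-neighbor distance from a fixed site, so (k−1)/(πR_k²) is unbiased for λ when k ≥ 2. The simulation below checks this empirically (window size, intensity, and k are illustrative choices, not values from the paper).

```python
import numpy as np

def knn_intensity_estimate(points, site, k):
    """(k-1)/(pi * R_k^2): unbiased for the intensity of a planar
    homogeneous Poisson process, where R_k is the k-th nearest-neighbor
    distance from the fixed site (valid for k >= 2)."""
    r_k = np.sort(np.linalg.norm(points - site, axis=1))[k - 1]
    return (k - 1) / (np.pi * r_k ** 2)

rng = np.random.default_rng(1)
lam, L, k = 5.0, 10.0, 10          # intensity, half-width of window, neighbor rank
estimates = []
for _ in range(2000):
    # simulate one realisation on [-L, L]^2: Poisson point count, uniform locations
    n = rng.poisson(lam * (2 * L) ** 2)
    pts = rng.uniform(-L, L, size=(n, 2))
    estimates.append(knn_intensity_estimate(pts, np.zeros(2), k))
print(np.mean(estimates))          # close to lam = 5
```

The window is taken large enough that R_k never reaches the boundary, so edge effects are negligible here.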
We demonstrate the sensitivity of the elbow rule for determining an optimal number of clusters in high-dimensional spaces characterized by tightly distributed data points. The high-dimensional samples are not artificially generated; they come from a real-world evolutionary many-objective optimization, comprising the Pareto fronts from the last 10 generations of an evolutionary optimization with 14 objective functions. The choice to analyze Pareto fronts is strategic: the user ultimately needs only one solution from the Pareto set to implement, so a systematic means of reducing the cardinality of solutions is imperative. Accordingly, this manuscript covers clustering the data and identifying the cluster from which to pick the desired solution, highlighting the implementation of the elbow rule and the use of hyper-radial distances for cluster identity. The Calinski-Harabasz statistic, which accounts for both the within-cluster and the between-cluster variance, was favored as the criterion for the elbow rule because of its robustness. This exercise also opened an opportunity to revisit the justification for using the highest Calinski-Harabasz criterion to determine the optimal number of clusters for multivariate data: the elbow rule predicted the upper end of the range of optimal cluster counts, while the highest-criterion method favored the lower end. Both results are used in a unique way to understand high-dimensional data, although they remain inconclusive as to which method determines the true optimal number of clusters.
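The Pareto-front data analyzed in the manuscript are not reproduced here, but the highest-criterion procedure it revisits can be sketched on synthetic blobs: run k-means for a range of k, score each labelling with the Calinski-Harabasz statistic CH(k) = [B/(k−1)] / [W/(n−k)], and take the k with the largest score. Everything below (Lloyd's k-means with restarts, the blob data) is an illustrative assumption, not the paper's setup.

```python
import numpy as np

def kmeans(X, k, iters=50, restarts=5, seed=0):
    """Plain Lloyd's algorithm with a few random restarts; returns the
    labelling with the smallest within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    best_labels, best_inertia = None, np.inf
    for _ in range(restarts):
        centers = X[rng.choice(len(X), k, replace=False)].copy()
        for _ in range(iters):
            labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
            for j in range(k):
                if (labels == j).any():
                    centers[j] = X[labels == j].mean(axis=0)
        inertia = ((X - centers[labels]) ** 2).sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels

def calinski_harabasz(X, labels):
    """CH = [B/(k-1)] / [W/(n-k)]: between- vs within-cluster dispersion,
    each scaled by its degrees of freedom."""
    n, overall = len(X), X.mean(axis=0)
    uniq = np.unique(labels)
    k = len(uniq)
    B = sum((labels == j).sum() * ((X[labels == j].mean(axis=0) - overall) ** 2).sum()
            for j in uniq)
    W = sum(((X[labels == j] - X[labels == j].mean(axis=0)) ** 2).sum() for j in uniq)
    return (B / (k - 1)) / (W / (n - k))

# Three well-separated blobs: the highest-criterion rule should pick k = 3.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.5, (50, 2)) for c in [(0, 0), (10, 0), (0, 10)]])
scores = {k: calinski_harabasz(X, kmeans(X, k)) for k in range(2, 7)}
```

On tightly distributed high-dimensional data, as the abstract notes, the CH-versus-k curve flattens and the elbow and the maximum need not coincide.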