We introduce a novel two-step approach for estimating a probability density function (pdf) from its samples, in which the second, key step comes from a geometric formulation. The procedure obtains an initial estimate of the pdf and then transforms it via a warping function to reach the final estimate. The initial estimate is intended to be computationally fast, albeit suboptimal, but warping it generates a larger, more flexible class of density functions, resulting in substantially improved estimation. The search for the optimal warping is accomplished by mapping diffeomorphic functions to the tangent space of a Hilbert sphere, a vector space whose elements can be expressed using an orthogonal basis. Using a truncated basis expansion, we estimate the optimal warping under a (penalized) likelihood criterion and, thus, the optimal density estimate. This framework is introduced for univariate, unconditional pdf estimation and then extended to conditional pdf estimation. The approach avoids many of the computational pitfalls associated with classical conditional-density estimation methods without sacrificing estimation performance. We derive asymptotic convergence rates of the density estimator and demonstrate the approach using both synthetic datasets and real data, the latter concerning the association of a toxic metabolite with preterm birth.
Key words and phrases: conditional density; density estimation; warped density; Hilbert sphere; sieve estimation; tangent space; weighted likelihood maximization
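The warping step described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: densities are taken to live on [0, 1], the warped density is taken to be f0(gamma(x)) * gamma'(x), the tangent space is taken at the constant function 1 with a Fourier basis, and the names `tangent_basis`, `warp_from_coeffs`, `warped_density` and `neg_log_lik` are hypothetical. The initial estimate `f0` is a stand-in for whatever fast estimator is used in practice.

```python
import numpy as np

def trapz(y, t):
    """Trapezoidal rule on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def tangent_basis(t, n_pairs):
    """Fourier functions, orthonormal in L2[0, 1] and orthogonal to the constant 1."""
    rows = []
    for j in range(1, n_pairs + 1):
        rows.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * j * t))
        rows.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * j * t))
    return np.array(rows)

def warp_from_coeffs(c, t):
    """Coefficients (even length) -> tangent vector v -> sphere point q -> (gamma, gamma')."""
    c = np.asarray(c, dtype=float)
    v = c @ tangent_basis(t, len(c) // 2)
    norm = np.sqrt(trapz(v**2, t))
    if norm < 1e-12:
        q = np.ones_like(t)                          # zero tangent vector: identity warping
    else:
        q = np.cos(norm) + np.sin(norm) * v / norm   # exponential map at the point 1
    dgamma = q**2                                    # gamma' = q^2; q must stay positive for a diffeomorphism
    steps = 0.5 * (dgamma[1:] + dgamma[:-1]) * np.diff(t)
    gamma = np.concatenate(([0.0], np.cumsum(steps)))
    return gamma / gamma[-1], dgamma / gamma[-1]     # rescale so that gamma(1) = 1 on the grid

def warped_density(c, f0, t):
    """Assumed warping action: f(x) = f0(gamma(x)) * gamma'(x)."""
    gamma, dgamma = warp_from_coeffs(c, t)
    return f0(gamma) * dgamma

def neg_log_lik(c, f0, t, x):
    """Negative log-likelihood of samples x under the warped density (no penalty term)."""
    f = warped_density(c, f0, t)
    return -float(np.sum(np.log(np.interp(x, t, f) + 1e-12)))

t = np.linspace(0.0, 1.0, 2001)
f0 = lambda u: 0.5 + u                                   # hypothetical initial estimate on [0, 1]
x = np.random.default_rng(0).beta(2.0, 5.0, size=200)   # synthetic samples
print(neg_log_lik(np.zeros(4), f0, t, x), neg_log_lik([0.0, 0.3, 0.0, 0.0], f0, t, x))
```

Because the coefficient vector is unconstrained, `neg_log_lik` could be handed to any general-purpose numerical optimizer. A penalty on the coefficients, as the abstract's "(penalized)" suggests, would be added to the returned value; its form is not specified here.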
The problem of nonparametrically estimating probability density functions (pdfs) from observed data requires posing and solving optimization problems on the space of pdfs. We take a geometric approach and explore this space for optimization using actions of a time-warping group. One action, termed area-preserving, is transitive and is applicable to unconstrained density estimation. In this case, we take a two-step approach that involves obtaining any initial estimate of the pdf and then transforming it via a warping function to reach the final estimate, while maximizing the log-likelihood function. Another action, termed mode-preserving, is useful in situations where the pdf is constrained in shape, i.e., the number of its modes is known. As before, we initialize the estimation with an arbitrary element of the correct shape class, and then search over all time warpings to reach the optimal pdf within that shape class. Optimization over warping functions is performed numerically using the geometry of the group of warping functions. These methods are illustrated using a number of simulated examples.
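As a rough illustration of a mode-preserving action, the sketch below composes a template with a warping and then renormalizes. That this is the exact action used in the paper is an assumption; the bimodal `template`, the one-parameter warping `gamma`, and the helper `count_modes` are all hypothetical and chosen only to keep the example self-contained.

```python
import numpy as np

def trapz(y, t):
    """Trapezoidal rule on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def count_modes(f):
    """Number of interior local maxima of a sampled function."""
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] >= f[2:])))

def template(u):
    """Hypothetical bimodal template (unnormalized) on [0, 1]."""
    left = np.exp(-0.5 * ((u - 0.3) / 0.05) ** 2)
    right = 0.6 * np.exp(-0.5 * ((u - 0.75) / 0.07) ** 2)
    return left + right

def gamma(t, a=0.6):
    """Hypothetical warping of [0, 1]; gamma' = 1 + a*cos(2*pi*t) > 0 whenever |a| < 1."""
    return t + a * np.sin(2.0 * np.pi * t) / (2.0 * np.pi)

t = np.linspace(0.0, 1.0, 2001)

# Normalized template, an arbitrary member of the two-mode shape class.
f_template = template(t) / trapz(template(t), t)

# Assumed mode-preserving action: compose with gamma, then renormalize to unit area.
f_warped = template(gamma(t))
f_warped = f_warped / trapz(f_warped, t)

print(count_modes(f_template), count_modes(f_warped))
```

Composition with an increasing warping moves the modes and changes their widths but cannot create or remove a local maximum, which is why the search can stay inside the shape class while only the warping varies.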
Estimation of a probability density function (pdf) from its samples, while satisfying certain shape constraints, is an important problem that has received little coverage in the literature. This paper introduces a novel geometric, deformable template constrained density estimator (dtcode) for estimating pdfs constrained to have a given number of modes. Our approach explores the space of pdfs constrained in this way using the set of shape-preserving transformations: an arbitrary template from the given shape class is transformed via a shape-preserving transformation to obtain the final optimal estimate. The search for this optimal transformation, under the maximum-likelihood criterion, is performed by mapping transformations to the tangent space of a Hilbert sphere, where they are effectively linearized and can be expressed using an orthogonal basis. This framework is first applied to (univariate) unconditional densities and then extended to conditional densities. We provide asymptotic convergence rates for dtcode and apply the framework to the speed distributions for different traffic flows on Californian highways. The supplementary materials for our paper can be found online.