In experimental design, we are given n vectors in d dimensions, and our goal is to select k n of them to perform expensive measurements, e.g., to obtain labels/responses, for a linear regression task. Many statistical criteria have been proposed for choosing the optimal design, with popular choices including A-and D-optimality. If prior knowledge is given, typically in the form of a d × d precision matrix A, then all of the criteria can be extended to incorporate that information via a Bayesian framework. In this paper, we demonstrate a new fundamental connection between Bayesian experimental design and determinantal point processes, the latter being widely used for sampling diverse subsets of data. We use this connection to develop new efficient algorithms for finding (1 + )-approximations of optimal designs under four optimality criteria: A, C, D and V. Our algorithms can achieve this when the desired subset size k is Ω( d A + log 1/ 2 ), where d A ≤ d is the A-effective dimension, which can often be much smaller than d. Our results offer direct improvements over a number of prior works, for both Bayesian and classical experimental design, in terms of algorithm efficiency, approximation quality, and range of applicable criteria.
It is often desirable to reduce the dimensionality of a large dataset by projecting it onto a lowdimensional subspace. Matrix sketching has emerged as a powerful technique for performing such dimensionality reduction very efficiently. Even though there is an extensive literature on the worst-case performance of sketching, existing guarantees are typically very different from what is observed in practice. We exploit recent developments in the spectral analysis of random matrices to develop novel techniques that provide provably accurate expressions for the expected value of random projection matrices obtained via sketching. These expressions can be used to characterize the performance of dimensionality reduction in a variety of common machine learning tasks, ranging from low-rank approximation to iterative stochastic optimization. Our results apply to several popular sketching methods, including Gaussian and Rademacher sketches, and they enable precise analysis of these methods in terms of spectral properties of the data. Empirical results show that the expressions we derive reflect the practical performance of these sketching methods, down to lower-order effects and even constant factors.
While fat-tailed densities commonly arise as posterior and marginal distributions in robust models and scale mixtures, they present challenges when Gaussian-based variational inference fails to capture tail decay accurately. We first improve previous theory on tails of Lipschitz flows by quantifying how the tails affect the rate of tail decay and by expanding the theory to non-Lipschitz polynomial flows. Then, we develop an alternative theory for multivariate tail parameters which is sensitive to tail-anisotropy. In doing so, we unveil a fundamental problem which plagues many existing flow-based methods: they can only model tailisotropic distributions (i.e., distributions having the same tail parameter in every direction). To mitigate this and enable modeling of tail-anisotropic targets, we propose anisotropic tail-adaptive flows (ATAF). Experimental results on both synthetic and real-world targets confirm that ATAF is competitive with prior work while also exhibiting appropriate tail-anisotropy.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.