Abstract. We propose a convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision. We prove that the $1/k^2$ convergence rate for the function values can be achieved if the admissible errors are of a certain type and satisfy a sufficiently fast decay condition. Our analysis is based on the machinery of estimate sequences first introduced by Nesterov for the study of accelerated gradient descent algorithms. Furthermore, we give a global complexity analysis, taking into account the cost of computing admissible approximations of the proximal point. An experimental analysis is also presented.
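To make the setting concrete, the following is a minimal sketch of an accelerated forward-backward iteration whose proximal step is deliberately perturbed. Everything specific here is an assumption for illustration: the lasso objective, the function names, and in particular the additive perturbation of norm $k^{-\text{decay}}$, which is only a stand-in and is not the notion of admissible error analyzed in the abstract.

```python
import numpy as np

def soft_threshold(v, t):
    """Exact proximity operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, b, lam, x):
    """Composite objective: smooth least squares plus an l1 penalty."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def inexact_accelerated_fb(A, b, lam, iters=200, decay=3.0, seed=0):
    """Accelerated forward-backward steps with a perturbed proximal point."""
    rng = np.random.default_rng(seed)
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for k in range(1, iters + 1):
        grad = A.T @ (A @ y - b)
        x_new = soft_threshold(y - grad / L, lam / L)
        # Hypothetical error model: add a perturbation of norm k**(-decay)
        # to mimic a proximal point that is only computed approximately.
        noise = rng.standard_normal(x.shape)
        x_new = x_new + noise / (np.linalg.norm(noise) * k ** decay)
        # Standard extrapolation (momentum) update.
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

Varying `decay` gives a rough feel for how the speed at which the errors vanish affects the final objective value; it does not reproduce the paper's experiments.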
The problem of recovering a structured signal $x \in \mathbb{C}^p$ from a set of dimensionality-reduced linear measurements $b = Ax$ arises in a variety of applications, such as medical imaging, spectroscopy, Fourier optics, and computerized tomography. Due to computational and storage complexity or physical constraints imposed by the problem, the measurement matrix $A \in \mathbb{C}^{n \times p}$ is often of the form $A = P_\Omega \Psi$ for some orthonormal basis matrix $\Psi \in \mathbb{C}^{p \times p}$ and subsampling operator $P_\Omega : \mathbb{C}^p \to \mathbb{C}^n$ that selects the rows indexed by $\Omega$. This raises the fundamental question of how best to choose the index set $\Omega$ in order to optimize the recovery performance. Previous approaches to addressing this question rely on non-uniform random subsampling using application-specific knowledge of the structure of $x$. In this paper, we instead take a principled learning-based approach in which a fixed index set is chosen based on a set of training signals $x_1, \dots, x_m$. We formulate combinatorial optimization problems seeking to maximize the energy captured in these signals in an average-case or worst-case sense, and we show that these can be efficiently solved either exactly or approximately via the identification of modularity and submodularity structures. We provide both deterministic and statistical theoretical guarantees showing how the resulting measurement matrices perform on signals differing from the training signals, and we provide numerical examples showing our approach to be effective on a variety of data sets.
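For the average-case criterion, the captured energy $\frac{1}{m}\sum_j \|P_\Omega \Psi x_j\|_2^2$ is a sum of per-index terms, so it is modular in $\Omega$ and is maximized by keeping the $n$ indices with the largest total energy. The sketch below illustrates this under stated assumptions: the function name, the row-wise layout of the training signals, and the cardinality constraint $|\Omega| = n$ are choices made here and are not taken from the abstract.

```python
import numpy as np

def learn_indices_average(X, Psi, n):
    """Pick n row indices of Psi capturing the most training energy.

    X   : (m, p) array, one training signal per row (assumed layout).
    Psi : (p, p) orthonormal basis matrix.
    n   : number of measurements to keep.
    """
    coeffs = Psi @ X.T                           # (p, m): column j holds Psi x_j
    energy = np.sum(np.abs(coeffs) ** 2, axis=1)  # total energy per index
    top = np.argsort(-energy)[:n]                # n largest per-index energies
    return np.sort(top)
```

For example, with `Psi = np.eye(4)` and training signals whose energy sits in coordinates 1 and 3, the function returns those two indices. The worst-case criterion mentioned in the abstract is not modular and is not covered by this sketch.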
Abstract. Structured sparsity methods have recently been proposed that incorporate additional spatial and temporal information when estimating models for decoding mental states from fMRI data. These methods carry the promise of being more interpretable than simpler Lasso or Elastic Net methods. However, although sparsity has often been advocated as leading to more interpretable models, we show that sparsity by itself, and also structured sparsity, can lead to unstable models. We present an extension of the Total Variation method and assess several other structured sparsity models on accuracy, sparsity and stability. Our results indicate that structured sparsity via the Sparse Total Variation can mitigate some of the instability inherent in simpler sparse methods, but more research is required to build methods that can reliably infer relevant activation patterns from fMRI data.
Abstract. Group-based sparsity models are instrumental in linear and non-linear regression problems. The main premise of these models is the recovery of "interpretable" signals through the identification of their constituent groups, which can also provably translate into substantial savings in the number of measurements for linear models in compressive sensing. In this paper, we establish a combinatorial framework for group-model selection problems and highlight the underlying tractability issues. In particular, we show that the group-model selection problem is equivalent to the well-known NP-hard weighted maximum coverage problem. Leveraging a graph-based understanding of group models, we describe group structures that enable correct model selection in polynomial time via dynamic programming. Furthermore, we show that popular group structures can be explained by linear inequalities involving totally unimodular matrices, which afford other polynomial time algorithms based on relaxations. We also present a generalization of the group model that allows for within-group sparsity, which can be used to model hierarchical sparsity. Finally, we study the Pareto frontier between approximation error and sparsity budget of group-sparse approximations for two tractable models, including the tree sparsity model, and illustrate selection and computation tradeoffs between our framework and the existing convex relaxations.
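The link to weighted maximum coverage can be seen on a toy instance: choosing at most a given number of groups so that the covered coordinates capture as much signal energy as possible. The sketch below enumerates all choices by brute force, which is exponential in general and is meant only to make the objective concrete; it is not one of the polynomial-time algorithms the abstract describes. The function name and the use of squared entries as weights are assumptions made here.

```python
from itertools import combinations

def best_group_cover(x, groups, budget):
    """Exhaustively pick `budget` groups covering the most energy of x.

    x      : sequence of signal values; coordinate i has weight x[i]**2.
    groups : list of sets of coordinate indices (groups may overlap).
    budget : maximum number of groups allowed.
    """
    weights = [v * v for v in x]
    best_value, best_choice = -1.0, ()
    # Weights are nonnegative, so using the full budget is never worse.
    size = min(budget, len(groups))
    for choice in combinations(range(len(groups)), size):
        covered = set().union(*(groups[g] for g in choice))
        value = sum(weights[i] for i in covered)
        if value > best_value:
            best_value, best_choice = value, choice
    return best_choice, best_value
```

Because overlapping coordinates are counted once, the value of a selection is not simply the sum of group energies, which is what separates this problem from a plain top-k selection.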
Source separation, or demixing, is the process of extracting multiple components entangled within a signal. Contemporary signal processing presents a host of difficult source separation problems, from interference cancellation to background subtraction, blind deconvolution, and even dictionary learning. Despite recent progress in each of these applications, advances in high-throughput sensor technology place demixing algorithms under pressure to accommodate extremely high-dimensional signals, separate an ever larger number of sources, and cope with more sophisticated signal and mixing models. These difficulties are exacerbated by the need for real-time action in automated decision-making systems. Recent advances in convex optimization provide a simple framework for efficiently solving numerous difficult demixing problems. This article provides an overview of the emerging field, explains the theory that governs the underlying procedures, and surveys algorithms that solve them efficiently. We aim to equip practitioners with a toolkit for constructing their own demixing algorithms that work, as well as concrete intuition for why they work. Fundamentals of demixing. The most basic model for mixed signals is a superposition model, where we observe a mixed signal $z_0 \in \mathbb{R}^d$ of the form $z_0 = x_0 + y_0$ (1), and we wish to determine the component signals $x_0$ and $y_0$. This simple model appears in many guises. Sometimes, superimposed signals come from basic laws of nature. The amplitudes of electromagnetic waves, for example, sum together at a receiver, making the superposition model (1) common in wireless communications. Similarly, the additivity of sound waves makes superposition models natural in speech and audio processing. Other times, a superposition provides a useful, if not literally true, model for more complicated nonlinear phenomena. Images, for example, can be modeled as the sum of constituent features: think of stars and galaxies that sum to create an image of a piece of the night sky.
In machine learning, superpositions can describe hidden structure, while in statistics, superpositions can model gross corruptions to data. These models also appear in texture repair, graph clustering, and line-spectral estimation. A conceptual understanding of demixing in all of these applications rests on two key ideas. Low-dimensional structures: Natural signals in high dimensions often cluster around low-dimensional structures with few degrees of freedom relative to the ambient dimension. Examples include bandlimited signals, array observations from seismic sources, and natural
Abstract. Recently, machine learning models have been applied to neuroimaging data, allowing predictions about a variable of interest to be made from the pattern of activation or anatomy over a set of voxels. These pattern-recognition-based methods have clear advantages over classical (univariate) techniques: they provide predictions for unseen data, as well as the weight of each voxel in the model. However, the resulting weight map cannot be thresholded to perform regionally specific inference, which makes it difficult to localize the variable of interest. In this work, we provide local averages of the weights according to regions defined by anatomical or functional atlases (e.g. the Brodmann atlas). These averages can then be ranked, thereby providing a sorted list of regions that can be (to a certain extent) compared with univariate results. Furthermore, we define a "ranking distance", allowing for the quantitative comparison between localized patterns. These concepts are illustrated with two datasets.
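The region-wise summary described above can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the function name is hypothetical, the atlas is represented as one integer label per voxel, and absolute weights are averaged so that positive and negative weights within a region do not cancel (the abstract does not specify this detail). The "ranking distance" is not defined in the abstract and is therefore not sketched.

```python
import numpy as np

def rank_regions(weights, atlas_labels):
    """Rank atlas regions by the mean absolute model weight of their voxels.

    weights      : 1-D array with one model weight per voxel.
    atlas_labels : 1-D integer array of the same length (region id per voxel).
    Returns a list of (region_id, mean_abs_weight), largest first.
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(atlas_labels)
    regions = np.unique(labels)
    # Assumption: average |w| per region rather than the signed weights.
    mean_abs = np.array([np.abs(weights[labels == r]).mean() for r in regions])
    order = np.argsort(-mean_abs, kind="stable")
    return [(int(regions[i]), float(mean_abs[i])) for i in order]
```

The sorted list is the object that can then be compared, to a certain extent, with the output of a univariate analysis.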
In this paper we study a class of regularized kernel methods for vector-valued learning which are based on filtering the spectrum of the kernel matrix. The considered methods include Tikhonov regularization as a special case, as well as interesting alternatives such as vector-valued extensions of L2 boosting. Computational properties are discussed for various examples of kernels for vector-valued functions and the benefits of iterative techniques are illustrated. Generalizing previous results for the scalar case, we show finite sample bounds for the excess risk of the obtained estimator and, in turn, these results allow us to prove consistency both for regression and multicategory classification. Finally, we present some promising results of the proposed algorithms on artificial and real data.
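Spectral filtering can be illustrated as follows: diagonalize the normalized kernel matrix, apply a scalar filter function to its eigenvalues, and map the outputs through the filtered spectrum. The sketch below is a simplification under stated assumptions: it uses a scalar kernel shared across output components (so the outputs `Y` form an `(n, T)` matrix), and the function names are hypothetical. The Landweber filter corresponds to a fixed number of gradient descent steps on the least squares loss, which is the usual reading of L2 boosting.

```python
import numpy as np

def spectral_filter_coefficients(K, Y, filter_fn):
    """Return coefficients C = (1/n) U g(S) U^T Y for a spectral filter g.

    K         : (n, n) symmetric positive semidefinite kernel matrix.
    Y         : (n, T) outputs, one column per output component.
    filter_fn : maps an array of eigenvalues of K/n to filtered values.
    """
    n = K.shape[0]
    sigma, U = np.linalg.eigh(K / n)
    sigma = np.clip(sigma, 0.0, None)  # remove tiny negative round-off
    return U @ (filter_fn(sigma)[:, None] * (U.T @ Y)) / n

def tikhonov(lam):
    """Filter g(s) = 1 / (s + lam); yields C = (K + n*lam*I)^{-1} Y."""
    return lambda s: 1.0 / (s + lam)

def landweber(steps, eta):
    """Filter g(s) = eta * sum_{i < steps} (1 - eta*s)^i."""
    def g(s):
        out = np.zeros_like(s)
        term = np.ones_like(s)
        for _ in range(steps):
            out += eta * term
            term = term * (1.0 - eta * s)
        return out
    return g
```

Swapping `tikhonov(lam)` for `landweber(steps, eta)` changes only the filter, which is the sense in which the different methods form one family; here the number of steps plays the role of the regularization parameter.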
Linear sketching and recovery of sparse vectors with randomly constructed sparse matrices have numerous applications in several areas, including compressive sensing, data stream computing, graph sketching, and combinatorial group testing. This paper considers the same problem with the added twist that the sparse coefficients of the unknown vector exhibit further correlations as determined by a known sparsity model. We prove that exploiting model-based sparsity in recovery reduces the sketch size without sacrificing recovery quality. In this context, we present the model-expander iterative hard thresholding algorithm, with rigorous performance guarantees, for recovering model-sparse signals from linear sketches obtained via sparse adjacency matrices of expander graphs. The main computational cost of our algorithm depends on the difficulty of projecting onto the model-sparse set. For the tree and group-based sparsity models we describe in this paper, such projections can be obtained in linear time. Finally, we provide numerical experiments to illustrate the theoretical results in action.
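As a small illustration of the sketching side only, the code below builds the adjacency matrix of a random bipartite graph in which every column has exactly `d` ones, and uses it to sketch a sparse vector. This is a common random construction that yields expanders with high probability; it is used here as an assumed stand-in, since the abstract does not say how its matrices are generated. The function name and parameters are hypothetical, and the recovery algorithm itself is not sketched.

```python
import numpy as np

def sparse_sketch_matrix(m, N, d, seed=0):
    """Binary (m, N) matrix with exactly d ones in each column.

    Each column is the neighbourhood of one left node in a random
    d-left-regular bipartite graph with N left and m right nodes.
    """
    rng = np.random.default_rng(seed)
    A = np.zeros((m, N), dtype=np.int8)
    for col in range(N):
        rows = rng.choice(m, size=d, replace=False)  # d distinct neighbours
        A[rows, col] = 1
    return A

# Usage: sketch a 3-sparse vector of length 50 into 20 measurements.
A = sparse_sketch_matrix(m=20, N=50, d=4)
x = np.zeros(50)
x[[3, 17, 41]] = [1.0, -2.0, 0.5]
y = A @ x
```

Because each column has only `d` nonzeros, computing the sketch costs on the order of `d` operations per nonzero entry of `x`, which is the practical appeal of sparse sketching matrices.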