2020
DOI: 10.48550/arxiv.2008.01839
Preprint

Sketching Datasets for Large-Scale Learning (long version)

Rémi Gribonval, Antoine Chatalic, Nicolas Keriven, et al.

Abstract: This article considers "sketched learning," or "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is first constructed by computing carefully chosen nonlinear random features (e.g., random Fourier features) and averaging them over the whole dataset. Parameters are then learned from the sketch, without access to the original dataset. This article surv…
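
As a concrete illustration of the pipeline the abstract describes (nonlinear random features averaged into a single vector, from which parameters are later learned), here is a minimal numpy sketch. The Gaussian frequency distribution, the sketch size m, and the scale sigma are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def compute_sketch(X: np.ndarray, m: int, sigma: float = 1.0,
                   seed: int = 0) -> np.ndarray:
    """Average complex random Fourier features exp(i w_j^T x) over the
    whole dataset, producing a single m-dimensional sketch vector."""
    rng = np.random.default_rng(seed)
    _, d = X.shape
    # One random frequency vector w_j ~ N(0, sigma^{-2} I) per sketch entry.
    W = rng.normal(scale=1.0 / sigma, size=(m, d))
    # Nonlinear random features for all points at once: shape (n, m).
    features = np.exp(1j * (X @ W.T))
    # The sketch is the empirical average of the features over the dataset.
    return features.mean(axis=0)

# The dataset (10_000 points in dimension 10 here) is compressed to
# m = 200 numbers; downstream learning would use only `z`, never `X`.
X = np.random.default_rng(1).normal(size=(10_000, 10))
z = compute_sketch(X, m=200)
print(z.shape)  # (200,)
```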

Cited by 2 publications (3 citation statements)
References 51 publications
“…The excess risk of the GMM learning task is then controlled by the sum of an empirical error term and a modeling error term. This guarantees that the estimated GMM approximates well the distribution of the data [19].…”
Section: Recovery Guarantees
Confidence: 70%
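
The quoted guarantee is additive in shape; the display below writes it out schematically in LaTeX. All symbols (the estimate $\hat{\theta}$, the target $\theta^\star$, the risk $\mathcal{R}$, and the two error terms) are notation assumed here for illustration, not taken from the cited paper.

```latex
% Schematic form of the quoted bound; symbol names are assumed here.
\[
  \underbrace{\mathcal{R}(\hat{\theta}) - \mathcal{R}(\theta^\star)}_{\text{excess risk}}
  \;\le\;
  \underbrace{\eta_{\mathrm{emp}}}_{\text{empirical error}}
  \;+\;
  \underbrace{\eta_{\mathrm{mod}}}_{\text{modeling error}}
\]
```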
“…Leveraging ideas from compressive sensing [15] and streaming algorithms [9], R. Gribonval et al. propose a sketching method [25, 19, 20, 18, 17] to compress the training database.…”
Confidence: 99%
“…The weighted variants of the variational (Feldman, Faulkner, and Krause 2011; Zhang et al. 2016; Campbell and Beronov 2019) and sampling-based (McGrory et al. 2014) methods then process the coresets. Reducing D relies on the compression of data into smaller representations via random projections (Siblini, Kuntz, and Meyer 2019; Ayesha, Hanif, and Talib 2020), which is achieved in two ways: (i) each data item is projected into an individual representation (Dasgupta 1999); (ii) all data items are projected into an overall representation, commonly referred to as a sketch (Keriven et al. 2018; Gribonval et al. 2020).…”
Section: More Remarks On Related Work
Confidence: 99%
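
To make the contrast between (i) per-item projections and (ii) an overall sketch concrete, here is a toy numpy comparison. The Gaussian projections and the dimensions are assumptions chosen for illustration; neither line reproduces the cited papers' exact constructions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 1000, 50, 10           # n items of dimension d, target dimension k
X = rng.normal(size=(n, d))

# (i) Each data item gets its own k-dimensional random projection
#     (Johnson-Lindenstrauss style): the output keeps one row per item.
R = rng.normal(size=(d, k)) / np.sqrt(k)
per_item = X @ R                 # shape (n, k): grows with the dataset

# (ii) All items are pooled into one overall representation (a "sketch"):
#      nonlinear random features averaged over the whole dataset.
W = rng.normal(size=(d, k))
sketch = np.exp(1j * (X @ W)).mean(axis=0)   # shape (k,): fixed size

print(per_item.shape)  # (1000, 10)
print(sketch.shape)    # (10,)
```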