Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015
DOI: 10.1145/2783258.2783373

Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC

Abstract: Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid overfitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, can not only match the prediction accuracy of standard MCMC methods like Gibbs sampling […]
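The abstract describes inference based on Distributed Stochastic Gradient Langevin Dynamics (DSGLD). As a rough illustration of the underlying stochastic gradient Langevin dynamics update for Bayesian matrix factorization, here is a minimal single-machine sketch in Python; it omits the paper's distributed block scheduling, and all names, priors, and hyperparameters (tau, lam, eps, the toy data) are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K = 100, 80, 5          # toy sizes: users, items, latent rank
tau, lam = 2.0, 0.1           # observation / prior precisions (assumed values)

# Toy observed ratings: (user, item, rating) triples from a random ground truth.
U_true, V_true = rng.normal(size=(N, K)), rng.normal(size=(D, K))
rows, cols = rng.integers(0, N, 2000), rng.integers(0, D, 2000)
obs = [(i, j, U_true[i] @ V_true[j] + rng.normal(scale=1.0 / np.sqrt(tau)))
       for i, j in zip(rows, cols)]

U = rng.normal(scale=0.1, size=(N, K))   # user factors being sampled
V = rng.normal(scale=0.1, size=(D, K))   # item factors being sampled
eps, batch_size = 1e-3, 256              # step size and minibatch size

for step in range(2000):
    batch_idx = rng.integers(0, len(obs), batch_size)
    grad_U, grad_V = -lam * U, -lam * V          # gradient of the log-prior
    scale = len(obs) / batch_size                # rescale minibatch likelihood
    for t in batch_idx:
        i, j, r = obs[t]
        err = r - U[i] @ V[j]
        grad_U[i] += scale * tau * err * V[j]    # d log p(r | U, V) / d U[i]
        grad_V[j] += scale * tau * err * U[i]    # d log p(r | U, V) / d V[j]
    # Langevin step: half the step size times the gradient, plus Gaussian noise.
    U += 0.5 * eps * grad_U + rng.normal(scale=np.sqrt(eps), size=U.shape)
    V += 0.5 * eps * grad_V + rng.normal(scale=np.sqrt(eps), size=V.shape)
```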

Cited by 47 publications (48 citation statements); references 12 publications. Citation types: 0 supporting, 48 mentioning, 0 contrasting. Citing publications range from 2016 to 2023.

Citation statements:
“…per iter. BMF + DSGLD Squared O((N + D)K(√U + 1)T) (Ahn et al., 2015) orthogonal NMF + PSGLD Squared O(DKT) (Şimşekli et al., 2017) orthogonal Distributed BPMF -- Load balance O((N + D)K(U − 1)T) (Vander Aa et al., 2017) Proposed method -- Flexible O((N + D)(K + K²)√U)…”
Section: Models Tune Learning (mentioning)
confidence: 99%
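For readability, the per-iteration cost quoted above for the BMF + DSGLD entry (Ahn et al., 2015) can be typeset as below; the symbol readings are assumptions based on common notation for this setting, not definitions taken from the cited table.

```latex
% Per-iteration cost quoted for BMF + DSGLD; symbol readings are assumptions:
%   N, D : the two dimensions of the rating matrix
%   K    : latent rank
%   U    : number of workers
%   T    : inner update count per iteration
\[
  \mathcal{O}\!\bigl((N + D)\,K\,(\sqrt{U} + 1)\,T\bigr)
\]
```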
“…In this paper, we present an algorithm and software to enable parallelization of CoGAPS for the analysis of large single-cell datasets. This parallelization was done by combining existing methods for Gibbs sampling (Ahn et al., 2015; Li et al.) with a new infrastructure for the updating steps in CoGAPS. Prior to the implementation of an asynchronous updating scheme, CoGAPS was applied to large data sets by using a distributed version of the algorithm, GWCoGAPS, that performed analysis across random sets of genes (Stein-O'Brien et al., 2017) or random sets of cells.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, the computational cost of implementing these approaches may be prohibitive for large single-cell datasets. Many NMF methods can be run in parallel, and thereby leverage the increasing availability of suitable hardware to scale for analysis of large single-cell datasets (Ahn et al., 2015; Li et al.).…”
Section: Introduction (mentioning)
confidence: 99%
“…It is an optimization method that attempts to find the values of the model coefficients (the parameter or weight vector) that minimize the loss function when they cannot be calculated analytically. SGD has proven to achieve state-of-the-art performance on a variety of machine learning tasks [3,6]. With its small memory footprint, robustness against noise, and fast learning rates, SGD is indeed a good candidate for training data-intensive models.…”
Section: Stochastic Gradient Descent (mentioning)
confidence: 99%
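The statement above describes SGD informally. The following minimal sketch shows the idea on a toy logistic-regression problem, where the weight vector is nudged against a minibatch gradient of the loss at each step; the model, data, and hyperparameters are illustrative assumptions, not taken from the cited works [3,6].

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 10))                    # toy features
w_true = rng.normal(size=10)
y = (X @ w_true + 0.1 * rng.normal(size=1000) > 0).astype(float)  # toy labels

def minibatch_grad(w, Xb, yb):
    """Gradient of the mean logistic loss on one minibatch."""
    p = 1.0 / (1.0 + np.exp(-(Xb @ w)))            # predicted probabilities
    return Xb.T @ (p - yb) / len(yb)

w = np.zeros(10)                                   # weight vector to learn
lr, batch_size = 0.1, 32                           # step size, minibatch size

for step in range(2000):
    idx = rng.integers(0, len(y), batch_size)      # draw a random minibatch
    w -= lr * minibatch_grad(w, X[idx], y[idx])    # step against the gradient
```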