2009 IEEE International Symposium on Parallel & Distributed Processing, 2009
DOI: 10.1109/ipdps.2009.5161058

Singular value decomposition on GPU using CUDA

Cited by 104 publications (53 citation statements)
References 16 publications
“…In LSA the dimension of the vector space is reduced by singular value decomposition (SVD), which is a computationally demanding operation. By a bidiagonalization method, a speedup up to 8x has been achieved for large matrices [26]. The baseline CPU implementation used Basic Linear Algebra Subprograms (BLAS) calls of the highly optimized Intel Math Kernel Library on an Intel Dual Core 2.66GHz PC, whereas the GPU version relied on Compute Unified BLAS (CUBLAS) running on an NVIDIA GTX 280.…”
Section: Text Mining on GPUs (mentioning)
confidence: 99%
“…Many of these tools are common but demand effort to be efficiently implemented into GPU formalism. Thankfully, they are progressively ported to CUDA, like the SVD implementation of Lahabar and Narayanan [15]. However, their integration into an existent framework is not always straightforward as special data structures are often required.…”
Section: Accuracy and Robustness (mentioning)
confidence: 99%
“…GPU-based QR decomposition was also studied by [8] using blocked Householder reflections. Recently, Lahabar et al. [9] presented a GPU-based SVD algorithm built upon the Golub-Reinsch method. They achieve up to 8× speedup over an Intel MKL implementation running on a dual-core CPU.…”
Section: Introduction (mentioning)
confidence: 99%