GPU Computing Gems Jade Edition 2012
DOI: 10.1016/b978-0-12-385963-1.00027-7

GPU Scripting and Code Generation with PyCUDA

Abstract: High-level scripting languages are in many ways polar opposites to GPUs. GPUs are highly parallel, subject to hardware subtleties, and designed for maximum throughput, and they offer a tremendous advance in the performance achievable for a significant number of computational problems. On the other hand, scripting languages such as Python favor ease of use over computational speed and do not generally emphasize parallelism. PyCUDA is a package that attempts to join the two together. This chapter argues that in …
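As a minimal sketch of the approach the abstract describes, the following example (illustrative only, not taken from the chapter itself, and written in the style of the PyCUDA documentation) compiles a hand-written CUDA C kernel from a Python string at run time and launches it on a NumPy array:

```python
import numpy as np
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

# Compile CUDA C source at run time; PyCUDA invokes nvcc and
# caches the resulting binary.
mod = SourceModule("""
__global__ void double_them(float *a)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    a[idx] *= 2.0f;
}
""")
double_them = mod.get_function("double_them")

a = np.random.randn(256).astype(np.float32)
a_gpu = cuda.mem_alloc(a.nbytes)  # device allocation
cuda.memcpy_htod(a_gpu, a)        # host -> device

double_them(a_gpu, block=(256, 1, 1), grid=(1, 1))

result = np.empty_like(a)
cuda.memcpy_dtoh(result, a_gpu)   # device -> host
assert np.allclose(result, 2 * a)
```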

Cited by 16 publications (16 citation statements); references 3 publications.
“…Our Matlab implementation, which is, however, not optimized for speed and logs large quantities of intermediate results, takes about three times as long. A Python implementation using PyCUDA (Klöckner et al. 2009) for GPU-enabled computation of the discrete Fourier transform (see Eqs. (A.2) and (A.3)) achieves a runtime of less than 10 min on a low-cost NVIDIA GeForce GT 430.…”
Section: Results on Simulated Data
confidence: 99%
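The cited paper's GPU code is not reproduced here, and a production implementation would typically call an optimized FFT library. Purely as an assumption-laden sketch of how a discrete Fourier transform can be written by hand in PyCUDA, the kernel below assigns one thread per output frequency:

```python
import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as gpuarray
from pycuda.compiler import SourceModule

# Naive O(N^2) DFT: thread k accumulates X[k] = sum_j x[j] e^{-2 pi i j k / N}.
mod = SourceModule("""
__global__ void dft(const float *x, float *re, float *im, int n)
{
    int k = threadIdx.x + blockIdx.x * blockDim.x;
    if (k >= n) return;
    float sr = 0.0f, si = 0.0f;
    for (int j = 0; j < n; ++j) {
        // Reduce k*j modulo n in integer arithmetic first, so the
        // float32 angle stays small and accurate.
        int m = (k * j) % n;
        float ang = -2.0f * 3.14159265f * m / n;
        sr += x[j] * cosf(ang);
        si += x[j] * sinf(ang);
    }
    re[k] = sr;
    im[k] = si;
}
""")
dft = mod.get_function("dft")

n = 1024
x = np.random.randn(n).astype(np.float32)
x_gpu = gpuarray.to_gpu(x)
re = gpuarray.empty(n, np.float32)
im = gpuarray.empty(n, np.float32)

dft(x_gpu, re, im, np.int32(n),
    block=(256, 1, 1), grid=((n + 255) // 256, 1))

# Loose tolerance to allow for float32 accumulation error.
assert np.allclose(re.get() + 1j * im.get(), np.fft.fft(x), atol=1e-1)
```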
“…Front-end programming models: Many systems provide GPU support in a high-level language: C++ [45], Java [99, 8, 81, 24], Matlab [7, 80], Python [25, 64]. While some go beyond simple GPU API bindings and provide support for compiling the high-level language to GPU code, none have Dandelion's cluster-scale support; unlike Dandelion, all expose the underlying device abstraction.…”
Section: Related Work
confidence: 99%
“…The earliest attempts were to create wrappers around the CUDA and OpenCL APIs that still require the programmer to write the kernel code by hand and expose a few vendor-specific libraries. Such attempts include PyCUDA [11] and PyOpenCL [12]. The current version of MATLAB's proprietary parallel computing toolbox also falls into this category at the time of writing.…”
Section: Related Work
confidence: 99%
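As an illustration of the "wrapper" style this statement describes (a sketch only, not code from the cited papers), PyCUDA's ElementwiseKernel still takes the kernel body as hand-written CUDA C, while the wrapper handles compilation, memory management, and the launch:

```python
import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as gpuarray
from pycuda.elementwise import ElementwiseKernel

# The argument list and the operation are literal CUDA C written
# by the programmer; PyCUDA generates the surrounding loop,
# compiles the kernel, and manages the launch configuration.
lin_comb = ElementwiseKernel(
    "float a, float *x, float b, float *y, float *z",
    "z[i] = a * x[i] + b * y[i]",
    "lin_comb")

x = gpuarray.to_gpu(np.random.randn(1024).astype(np.float32))
y = gpuarray.to_gpu(np.random.randn(1024).astype(np.float32))
z = gpuarray.empty_like(x)

lin_comb(2.0, x, 3.0, y, z)
assert np.allclose(z.get(), 2 * x.get() + 3 * y.get())
```

Note that only the per-element expression is supplied by the user; the indexing loop and kernel boilerplate are generated by the wrapper, which is exactly the division of labor the quoted statement attributes to PyCUDA and PyOpenCL.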