NVIDIA GPUs are typical stream-processor devices with high floating-point performance. CUDA introduces a new computing architecture that delivers far greater computing power than the CPU for large-scale data-computing applications. The learning algorithm of the BP (back-propagation) neural network is compute-intensive and highly regular, which makes it well suited to the stream-processor architecture. Using CUDA technology, the CUBLAS mathematical library, and self-written kernels, with an NVIDIA GeForce GTX 280 as the hardware platform, we parallelize the training algorithm, define a parallel data structure, and describe the mechanism for mapping the computing tasks and the key algorithms onto CUDA. A simulation experiment compares the parallel training algorithm on the GTX 280 with the serial algorithm on the CPU; the training time improves by a factor of nearly 15.
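As a minimal sketch of the CUBLAS-plus-custom-kernels split described above (this is not the authors' code; the function names, dimensions, and the modern `cublas_v2` API used here are illustrative assumptions), one forward-pass layer of a BP network, y = sigmoid(W·x + b), might be mapped to the GPU as follows: the matrix-vector product goes to cuBLAS, while the element-wise activation runs in a hand-written kernel.

```cuda
// Illustrative sketch, not the paper's implementation: one BP layer
// y = sigmoid(W*x + b) on the GPU, combining cuBLAS with a self-kernel.
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <math.h>

// Element-wise sigmoid activation: y[i] = 1 / (1 + exp(-y[i]))
__global__ void sigmoid_kernel(float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = 1.0f / (1.0f + expf(-y[i]));
}

// Compute y = sigmoid(W*x + b) for one layer.
// d_W: rows x cols (column-major), d_x: cols, d_b and d_y: rows;
// all pointers refer to device memory.
void forward_layer(cublasHandle_t handle, const float *d_W,
                   const float *d_x, const float *d_b, float *d_y,
                   int rows, int cols) {
    const float one = 1.0f;
    // y <- b, then y <- W*x + y (GEMV accumulates the bias term)
    cudaMemcpy(d_y, d_b, rows * sizeof(float), cudaMemcpyDeviceToDevice);
    cublasSgemv(handle, CUBLAS_OP_N, rows, cols,
                &one, d_W, rows, d_x, 1, &one, d_y, 1);
    // Apply the activation with a hand-written kernel
    int threads = 256;
    int blocks = (rows + threads - 1) / threads;
    sigmoid_kernel<<<blocks, threads>>>(d_y, rows);
}
```

Batching many training samples turns the GEMV into a GEMM, which is where the stream-processor architecture yields the largest speedup over a serial CPU loop.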