2011
DOI: 10.1109/tpds.2010.70
Performance Evaluation of Convolution on the Cell Broadband Engine Processor

Cited by 10 publications
(9 citation statements)
References 8 publications
“…25 It simulates the evolution of particle distribution functions over a 3D lattice over many timesteps. For each time-step, at each grid point, the computation performed involves directional density values for the grid point and its face (6) and edge (12) neighbors (also referred to as D3Q19).…”
Section: Lattice Boltzmann Methods (LBM), mentioning
confidence: 99%
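
The neighbor set described in the statement above can be enumerated directly. The following is a minimal sketch, not code from the cited paper: the function name `d3q19_offsets` is hypothetical, and it assumes a unit-spaced cubic lattice where each offset is a triple of -1, 0, or 1.

```python
from itertools import product


def d3q19_offsets():
    """Return the 19 D3Q19 offsets: the grid point itself, 6 face and 12 edge neighbors."""
    offsets = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        # Number of axes along which the neighbor is displaced:
        # 0 = the point itself, 1 = face neighbor, 2 = edge neighbor,
        # 3 = corner neighbor (not part of D3Q19).
        displaced_axes = abs(dx) + abs(dy) + abs(dz)
        if displaced_axes <= 2:
            offsets.append((dx, dy, dz))
    return offsets


offsets = d3q19_offsets()
faces = [o for o in offsets if sum(map(abs, o)) == 1]
edges = [o for o in offsets if sum(map(abs, o)) == 2]
print(len(offsets), len(faces), len(edges))  # 19 6 12
```

Dropping the 8 corner offsets from the full 3 x 3 x 3 neighborhood of 27 points leaves the 19 directions that give D3Q19 its name.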
“…12 8 K on 1.28 M pixels 1.9 × 10^6 Pixels/sec BlackScholes 20 1 M call + put options 8. compiler generated extra spill/fill instructions, that resulted in performance gap of 1.4X. 5: LIBOR: LIBOR 4 has an outer loop over all paths of the Monte Carlo simulation, and an inner loop over the forward rates on a single path.…”
Section: Best Optimized Performance, mentioning
confidence: 99%
“…Complicated applications often require a huge amount of available computing processing and network capacity, provided as an infrastructure as a service (IaaS), in support of large-scale experiments. Examples of such applications are in the domain of seismic imaging [14], aerospace [15], meteorology [16], and convolution-based applications [9]. In this work, we focus on divisible load applications, where an application load can be divided into a number of tasks that can be processed independently in parallel ( [17][18][19]).…”
Section: Background and Motivations, mentioning
confidence: 99%
“…For instance, it is used in Google map-reduce programming model [7]; i.e., the rationale of why we have chosen the master-worker model as a base model. The master-worker model arises in divisible-load applications, where there is no communication between the workers, such as search for a pattern, compression, join, graph coloring and generic search applications [8], convolution-based applications [9], and image processing applications [10]. Furthermore, as stated in [5], a Cloud is considered as a candidate platform to run heavy-load applications to satisfy performance requirements.…”
Section: Introduction, mentioning
confidence: 99%
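
To illustrate why convolution fits the divisible-load, master-worker pattern mentioned in the two statements above, here is a minimal sketch under stated assumptions. It is not taken from any of the cited papers: the names `convolve_slice` and `master_convolve` are hypothetical, the input is a 1D signal, only fully overlapping ("valid") outputs are computed, and a thread pool stands in for a real set of distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor


def convolve_slice(signal, kernel, start, stop):
    """Worker task: compute outputs start..stop-1 of the 'valid' 1D convolution."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[k - 1 - j] for j in range(k))  # kernel is flipped
        for i in range(start, stop)
    ]


def master_convolve(signal, kernel, workers=4):
    """Master: split the output range into independent chunks and gather the results."""
    n_out = len(signal) - len(kernel) + 1
    step = max(1, -(-n_out // workers))  # ceiling division
    bounds = [(s, min(s + step, n_out)) for s in range(0, n_out, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk reads only the shared input; workers never talk to each other.
        parts = pool.map(lambda b: convolve_slice(signal, kernel, *b), bounds)
        return [y for part in parts for y in part]


print(master_convolve([1, 2, 3, 4, 5], [1, 0, -1], workers=2))  # [2, 2, 2]
```

Each output sample depends only on a fixed window of the input, so the master can hand out output ranges of any size and simply concatenate what comes back.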
“…Several methods have been used to assist NCC for reducing its searching and computing times in image matching, such as the pyramid method [3,7]. In addition, many parallel algorithms of the inner-product have been published that can perform fast cross-correlation for NCC [16,17], where the Distributed Arithmetic (DA) with look-up table has not multiplication, but needs much Read-Only Memory (ROM) [18].…”
Section: Introduction, mentioning
confidence: 99%
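
The link between NCC and the inner product in the statement above can be made concrete. This is a minimal sketch, not the method of the citing paper: the function name `ncc` is hypothetical, the inputs are assumed to be equal-length flattened patches, and the zero-mean variant of normalized cross-correlation is used.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    n = len(patch)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    dp = [p - mean_p for p in patch]
    dt = [t - mean_t for t in template]
    # The numerator is a plain inner product of the mean-subtracted vectors.
    numerator = sum(a * b for a, b in zip(dp, dt))
    denominator = math.sqrt(sum(a * a for a in dp) * sum(b * b for b in dt))
    # A constant patch has zero variance; return 0.0 by convention in this sketch.
    return numerator / denominator if denominator else 0.0


print(ncc([1, 2, 3], [2, 4, 6]))  # 1.0
```

Because the numerator is an inner product, any fast or parallel inner-product routine speeds up the dominant cost of evaluating NCC at each candidate position.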