2013
DOI: 10.1109/tap.2013.2258882
An OpenMP-CUDA Implementation of Multilevel Fast Multipole Algorithm for Electromagnetic Simulation on Multi-GPU Computing Systems

Cited by 66 publications (34 citation statements)
References 19 publications
“…The implementation of aggregation and disaggregation at the finest level on the GPU was proposed by allocating one thread to each spectrum point [16]. To further increase GPU utilization on the Kepler architecture (GK110), whose warps contain 32 threads, a two-step scheme is designed.…”
Section: Principle of MLFMM and Its Optimization on GPU
confidence: 99%
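The thread-per-spectrum-point assignment described in the excerpt above can be pictured with a minimal CUDA sketch. All kernel and variable names here are illustrative, not taken from the cited implementation; it assumes the finest-level aggregation sums the weighted radiation patterns of the basis functions in a box over each k-space sample point.

```cuda
// Hypothetical sketch: one CUDA thread per spectrum (k-space) point of a
// finest-level MLFMA box, as in the thread-based task assignment of [16].
#include <cuComplex.h>

__global__ void aggregateFinestLevel(
    const cuFloatComplex* basisCoeffs,  // current coefficients of the box's basis functions
    const cuFloatComplex* radPatterns,  // sampled radiation patterns, [numBasis x numSpectrumPts]
    cuFloatComplex* boxPattern,         // output: aggregated far-field pattern of the box
    int numBasis, int numSpectrumPts)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;  // one thread <-> one spectrum point
    if (k >= numSpectrumPts) return;

    cuFloatComplex acc = make_cuFloatComplex(0.0f, 0.0f);
    for (int b = 0; b < numBasis; ++b) {
        // accumulate the weighted pattern sample of each basis function
        acc = cuCaddf(acc, cuCmulf(basisCoeffs[b],
                                   radPatterns[b * numSpectrumPts + k]));
    }
    boxPattern[k] = acc;
}
```

Launching with block sizes that are multiples of 32 keeps every warp fully populated, which is the utilization concern the excerpt raises for the 32-thread warps of GK110.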
“…Meanwhile, all 32 threads in a warp can read the data for a given spectrum point through the read-only cache by using the `__ldg()` instruction or the `__restrict__` qualifier. Unlike the strategy of thread-based task assignment proposed for aggregation and disaggregation at the coarser levels [16], we go a step further in data storage by exploiting Kepler's texture memory, which is four times larger than Fermi's. Since local interpolation/anterpolation frequently accesses neighboring data, it is better to store the data in texture memory in a pattern that mirrors the geometric topology.…”
Section: Principle of MLFMM and Its Optimization on GPU
confidence: 99%
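The read-only-cache access mentioned in the excerpt above can be sketched as follows. This is a simplified, hypothetical kernel, not the cited code: it only shows the two mechanisms named, the `const __restrict__` qualification and the explicit `__ldg()` intrinsic, which on Kepler GK110 route loads through the read-only data cache.

```cuda
// Illustrative sketch: routing per-spectrum-point loads through the
// read-only data cache on Kepler (GK110 and later).
__global__ void readThroughReadOnlyCache(
    const float2* __restrict__ src,  // const __restrict__ lets the compiler use the read-only cache
    float2* dst, int n)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= n) return;

    // __ldg() explicitly forces the load through the read-only cache.
    float2 v = __ldg(&src[k]);
    dst[k] = v;
}
```

The texture-memory layout the excerpt describes (storing data in a pattern that mirrors the geometric topology) additionally exploits the texture cache's 2D spatial locality, which suits the neighboring-sample accesses of local interpolation/anterpolation.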
“…Indeed, many accurate and fast numerical methods have been developed in recent decades for scattering calculations, fast antenna analysis, and RCS predictions, facing the important trade-off between the accuracy of the results and the rapidity of the simulations [2][3][4][5][6][7].…”
Section: Introduction
confidence: 99%
“…Parallelism is the future of computing [9], and the interest of the Antennas and Propagation community in high-performance computing, and in particular in parallel programming on GPUs to face computationally burdensome problems, has been remarkable, as witnessed by [2][3][4][5][6][7] and by other electromagnetic numerical methods that have benefitted from GPU computing [10][11][12][13][14][15][16][17]. From this starting point, it is clear that the electromagnetic community can take advantage of this technological evolution to employ ever-more sophisticated numerical methods.…”
Section: Introduction
confidence: 99%