Fast Multipole Methods (FMM) are a fundamental operation for the simulation of many physical problems. The high-performance design of such methods usually requires carefully tuning the algorithm for both the targeted physics and hardware. In this paper, we propose a new approach that achieves high performance across architectures. Our method consists of expressing the FMM algorithm as a task flow and employing a state-of-the-art runtime system, StarPU, to process the tasks on the different computing units. We carefully design the task flow, the mathematical operators, their implementations, and the scheduling schemes. Potentials and forces on 200 million particles are computed in 42.3 seconds on a homogeneous 160-core SGI Altix UV 100, and good scalability is shown.
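The idea of expressing the FMM as a task flow can be illustrated with a very coarse sketch. The dependency table below uses the standard textbook FMM operator names (P2M, M2M, M2L, L2L, L2P, P2P) and their usual ordering; it is an assumption made for illustration, not the task graph or the StarPU code from the paper. A real task flow works at the granularity of individual cells or groups of cells rather than whole operators.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Operator-level dependencies of a generic FMM pass (textbook ordering,
# assumed here for illustration). Each key depends on the operators in its set.
FMM_DEPENDENCIES = {
    "P2M": set(),      # particles -> leaf multipole expansions
    "M2M": {"P2M"},    # upward pass: child multipoles -> parent multipoles
    "M2L": {"M2M"},    # far field: multipole -> local expansions
    "L2L": {"M2L"},    # downward pass: parent locals -> child locals
    "L2P": {"L2L"},    # local expansions -> particles
    "P2P": set(),      # near field: direct interactions, no dependencies
}


def ready_batches(dependencies):
    """Return successive batches of operators that may run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose inputs are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for step, batch in enumerate(ready_batches(FMM_DEPENDENCIES)):
        print(step, batch)
```

In this simplified view the near-field P2P work has no dependencies, so a scheduler is free to overlap it with the upward and downward passes.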
A new boundary element method (BEM) is developed for three-dimensional analysis of fiber-reinforced composites based on a rigid-inclusion model. Elasticity equations are solved in an elastic domain containing inclusions that can be assumed much stiffer than the host elastic medium. The inclusions can therefore be treated as rigid, each with only six rigid-body displacements. It is shown that the boundary integral equation (BIE) in this case can be simplified so that only the integral with the weakly singular displacement kernel is present. The BEM accelerated with the fast multipole method is used to solve the established BIE. The developed BEM code is validated against the analytical solution for a rigid sphere in an infinite elastic domain, and excellent agreement is achieved. Numerical examples of fiber-reinforced composites, with more than 5800 fibers and more than 10 million total degrees of freedom, are solved successfully by the developed BEM. Effective Young’s moduli of fiber-reinforced composites are evaluated for uniformly and “randomly” distributed fibers with two different aspect ratios and volume fractions. The developed fast multipole BEM is demonstrated to be very promising for large-scale analysis of fiber-reinforced composites when the fibers can be assumed rigid relative to the matrix materials.
The barn owl's inferior colliculus contains a retina-like map of space on which a sound generates a focus of activity whose position corresponds to the location of the sound source. When there is more than one source of sound, the sound waves sum and may generate spurious binaural cues that degrade the auditory image. We investigated the signal conditions under which neurons in the owl's auditory space map are able to resolve two simultaneously active sound sources. We recorded from space map neurons responding to sounds from a pair of speakers separated in azimuth by 45 degrees and mounted on a rotatable arm. Stimuli consisted of a sum of sinusoids or pseudorandom noise bursts emitted simultaneously and at equal overall levels. The characteristics of the sounds in each speaker were varied, and the neuron's response was plotted as a function of the speaker pair's position. When the speakers emitted different sets of summed sinusoids, the cells responded to each speaker separately; that is, the cells were able to resolve two separate targets. However, when the speakers emitted identical summed sinusoids generating binaural cues that were identical to those of a single phantom source between the two speakers, the neurons responded when the speakers were on either side of their receptive fields. By manipulating the amplitude at which each speaker emitted the various frequencies, we could control the position, number, and size of the phantom sources detected by the cell. The cells also resolved two separate sources when they emitted noise bursts that were statistically independent or temporally reversed versions of one another. Since the overall spectra of such waveforms are identical, we suggest that the space map relies on differences between noise bursts that exist over brief time spans.
A high-performance fast multipole method is crucial for the numerical simulation of many physical problems. In a previous study, we showed that a task-based fast multipole method provides the flexibility required to process a wide spectrum of particle distributions efficiently on multicore architectures. In this paper, we show how such an approach can be extended to fully exploit heterogeneous platforms. To that end, we design highly tuned graphics processing unit (GPU) versions of the two dominant operators (P2P and M2L) as well as a scheduling strategy that dynamically decides which proportion of subsequent tasks is processed on regular CPU cores and on GPU accelerators. We assess our method with the StarPU runtime system for executing the resulting task flow on an Intel X5650 Nehalem multicore processor possibly enhanced with one, two, or three Nvidia Fermi M2070 or M2090 GPUs (Santa Clara, CA, USA). A detailed experimental study on two 30 million particle distributions (a cube and an ellipsoid) shows that the resulting software consistently achieves high performance across architectures.
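To make the P2P (particle-to-particle) operator concrete, here is a minimal direct-summation sketch. It assumes a 1/r potential kernel and distinct particle positions; the abstract does not name the kernel, and this plain CPU loop is not the tuned GPU implementation described above. The function name `p2p_potentials` is hypothetical.

```python
import math


def p2p_potentials(positions, charges):
    """Direct near-field sum: potential at each particle from all the others.

    Assumes a 1/r kernel (illustrative choice) and that no two particles
    share a position, since a zero distance would divide by zero.
    """
    n = len(positions)
    potentials = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])  # Euclidean distance
            # The kernel is symmetric, so each pair is visited only once.
            potentials[i] += charges[j] / r
            potentials[j] += charges[i] / r
    return potentials


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(p2p_potentials(pts, [1.0, 2.0]))
```

The quadratic cost of this loop is why the FMM restricts P2P to neighbouring leaf cells and handles distant interactions through expansions such as M2L.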