Programming current supercomputers efficiently is a challenging task. Multiple levels of parallelism on the core, on the compute node, and between nodes need to be exploited to make full use of the system. Heterogeneous hardware architectures with accelerators further complicate the development process. waLBerla addresses these challenges by providing the user with highly efficient building blocks for developing simulations on block-structured grids. The block-structured domain partitioning is flexible enough to handle complex geometries, while the structured grid within each block allows for highly efficient implementations of stencil-based algorithms. We present several example applications realized with waLBerla, ranging from lattice Boltzmann methods to rigid particle simulations. Most importantly, these methods can be coupled together, enabling multiphysics simulations. The framework uses meta-programming techniques to generate highly efficient code for CPUs and GPUs from a symbolic method formulation. To ensure software quality and performance portability, a continuous integration toolchain automatically runs an extensive test suite encompassing multiple compilers, hardware architectures, and software configurations.
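The code-generation idea described above can be illustrated with a minimal sketch: a symbolic description of a stencil (here just a dictionary of offsets and coefficients) is turned into a C kernel string. This is an illustrative toy only; waLBerla's actual pipeline builds on a full symbolic method formulation and a dedicated generator, and the function and names below are hypothetical.

```python
# Minimal sketch of stencil code generation (illustrative only; not
# waLBerla's actual code-generation pipeline).
def generate_stencil_kernel(name, coeffs):
    """Emit a C kernel applying a 1D stencil given as {offset: coefficient}."""
    terms = " + ".join(
        f"{c} * src[i + ({o})]" for o, c in sorted(coeffs.items())
    )
    return (
        f"void {name}(const double *src, double *dst, int n) {{\n"
        f"    for (int i = 1; i < n - 1; ++i)\n"
        f"        dst[i] = {terms};\n"
        f"}}\n"
    )

# Second-order central difference for the 1D Laplacian (grid spacing folded in).
kernel = generate_stencil_kernel("laplacian_1d", {-1: 1.0, 0: -2.0, 1: 1.0})
print(kernel)
```

Because the method is specified once symbolically, the same description can drive backends for both CPUs and GPUs, which is the key to the performance portability mentioned above.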
Formulating a consistent theory for rigid-body dynamics with impacts is an intricate problem. Twenty years ago, Stewart published the first consistent theory with purely inelastic impacts and an impulsive friction model analogous to Coulomb friction. In this paper we demonstrate that the consistent impact model can exhibit multiple solutions with varying degrees of dissipation, even in the single-contact case. Replacing the impulsive friction model based on Coulomb friction with a model based on the maximum dissipation principle resolves the non-uniqueness in the single-contact impact problem. The paper constructs the alternative impact model and presents integral equations describing rigid-body dynamics with a non-impulsive and non-compliant contact model and an associated purely inelastic impact model maximizing dissipation. An analytic solution is derived for the single-contact impact problem. The models are then embedded into a time-stepping scheme. The macroscopic behaviour is compared to Coulomb friction in a large-scale granular flow problem.
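The maximum-dissipation impact law mentioned above can be sketched in a common rigid-body notation (this is a sketch, not the paper's exact formulation): let M be the generalized mass matrix, v⁻ the pre-impact velocity, J the contact Jacobian, and C = {λ = (λₙ, λₜ) : λₙ ≥ 0, ‖λₜ‖ ≤ μλₙ} the friction cone.

```latex
% Post-impact velocity as a function of the contact impulse \lambda:
%   v^+(\lambda) = v^- + M^{-1} J^\top \lambda
% Maximum-dissipation impact law (sketch): choose the admissible impulse
% that minimizes the post-impact kinetic energy, i.e. maximizes the
% dissipated energy, subject to the friction cone and a non-penetrating
% normal velocity after the impact.
\Lambda \in \operatorname*{arg\,min}_{\substack{\lambda \in C \\ (J v^+(\lambda))_n \ge 0}}
  \tfrac{1}{2}\, v^+(\lambda)^\top M \, v^+(\lambda)
```

For a single contact this optimization admits the analytic solution derived in the paper, whereas the Coulomb-type impulsive friction law can admit multiple solutions with differing dissipation.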
As compute power increases with time, more involved and larger simulations become possible. However, it becomes increasingly difficult to use the provided computational resources efficiently. Especially in particle-based simulations with a spatial domain partitioning, large load imbalances can occur because the simulation is dynamic; a static domain partitioning may then be unsuitable and can deteriorate the overall runtime of the simulation significantly. Sophisticated load balancing strategies must be designed to alleviate this problem. In this paper we conduct a systematic evaluation of the performance of six different load balancing algorithms. Our tests cover a wide range of simulation sizes and employ one of the largest supercomputers available. In particular, we carefully study the runtime and memory complexity of all components of the simulation. When progressing to extreme-scale simulations, it is essential to identify bottlenecks and to predict the scaling behaviour. Scaling experiments are shown for more than one million processes. The performance of each algorithm is analyzed with respect to the quality of the load balancing and its runtime costs. Additionally, an applied test case is used to judge the applicability of the best algorithms in real-world applications. For all tests, the waLBerla multiphysics framework is employed.

[…] processes [6,7]. One important aspect of this initial domain partitioning is to achieve an equal workload for all cores. However, since the simulated system is dynamic and particles may migrate between subdomains, the workload can shift during the simulation. This leads to load imbalances that can slow down the whole simulation. To overcome this problem, the domain partitioning must be adapted dynamically throughout the simulation and/or the subdomains must be reassigned to different processes. Many simulation frameworks have therefore adopted load balancing, and results have been published for simulations of various sizes.
Related Work
Compared to rigid-body dynamics, molecular dynamics simulations differ in some aspects; however, the load balancing problem is closely related. Therefore, we also consider methods proposed in the context of molecular dynamics here. A slightly dated but still relevant review of methods suitable for load balancing can be found in [8]. Owen et al. [9] use load balancing based on the ParMetis [10] graph partitioning library to balance their combined FEM-DEM simulation. They use two applied test cases, namely a 2D bucket-filling and a 3D hopper-filling example, and present measurements with up to 6 cores. Deng et al. [11] present a runtime load balancing approach for molecular dynamics simulations that deforms the domain partitioning at runtime. The initial rectangular grid is optimized by moving the corners of all subdomains individually in space to adjust to the simulation. Good partitioning quality is reported for an artificial checkerboard scenario with no acting forces. The load balancing improves the runtime performance but...
Nanometer-thin single-walled carbon nanotube (CNT) films collected from aerosol chemical deposition reactors have gathered attention for their promising applications. Densification of these pristine films provides an important way to manipulate their mechanical, electronic, and optical properties. To elucidate the underlying microstructural restructuring, which is ultimately responsible for the change in properties, we perform large-scale vector-based mesoscopic distinct element method simulations in conjunction with electron microscopy and spectroscopic ellipsometry characterization of pristine films and films densified by drop-cast volatile liquid processing. In agreement with the microscopy observations, pristine CNT films of finite thickness are modeled as self-assembled CNT networks comprising entangled dendritic bundles with branches extending down to individual CNTs. Simulations of these films under uniaxial compression uncover a soft deformation regime extending up to an ∼75% strain. Upon load removal, the pre-compressed samples evolve into homogeneously densified films with thickness values depending on both the pre-compression level and the sample microstructure. The significant reduction in thickness is attributed to the underlying structural changes occurring at the 100 nm scale, including the zipping of the thinnest dendritic branches.