CPU/GPU computing allows scientists to accelerate their numerical codes substantially. In this paper, we port a double-precision alternating direction implicit (ADI) solver for the three-dimensional compressible Navier-Stokes equations from our in-house Computational Fluid Dynamics (CFD) software to a heterogeneous platform and optimize it. First, we implement a full GPU version of the ADI solver to remove redundant data transfers between CPU and GPU, and then design two fine-grain schemes, namely "one-thread-one-point" and "one-thread-one-line", to maximize performance. Second, we present a dual-level parallelization scheme that uses the CPU/GPU collaborative model to exploit the computational resources of both multi-core CPUs and many-core GPUs within the heterogeneous platform. Finally, because the memory of a single node becomes inadequate as the simulation size grows, we present a tri-level hybrid programming pattern, MPI-OpenMP-CUDA, that merges fine-grain parallelism using OpenMP and CUDA threads with coarse-grain parallelism using MPI for inter-node communication. We also propose a strategy to overlap computation with communication using advanced features of CUDA and MPI programming. We obtain a speedup of 6.0 for the ADI solver on one Tesla M2050 GPU compared with two Xeon X5670 CPUs. Scalability tests show that our implementation offers a significant performance improvement on the heterogeneous platform.
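To make the "one-thread-one-line" idea concrete, the following is a minimal serial sketch in Python, not the authors' CUDA implementation. It assumes that each ADI sweep reduces to one independent scalar tridiagonal system per grid line, solved here with the Thomas algorithm; the abstract does not state which line solver is used, and a compressible Navier-Stokes solver may well involve block systems instead. The names `thomas_solve` and `solve_lines` are hypothetical, and the loop over lines stands in for what one GPU thread per line would do.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the Thomas algorithm.

    lower[i], diag[i], upper[i] are the sub-, main- and super-diagonal
    entries of row i; lower[0] and upper[-1] are ignored.
    Assumes the system is diagonally dominant (no pivoting is done).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward elimination.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def solve_lines(lines):
    """Solve every grid line independently.

    Each element of `lines` is a (lower, diag, upper, rhs) tuple.
    In a one-thread-one-line scheme, each iteration of this loop
    would be the work assigned to a single thread.
    """
    return [thomas_solve(*line) for line in lines]


if __name__ == "__main__":
    line = ([0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
    print(solve_lines([line, line]))
```

Because the lines do not depend on each other within a sweep, the outer loop is trivially parallel, while the recurrence inside `thomas_solve` is sequential along the line. That contrast is presumably what distinguishes a per-line mapping from a per-point mapping.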
The complex environment wind engineering simulation software CEWES was designed and developed as part of the National Numerical Wind Tunnel Project (NNW). First, based on the characteristics of the physical problems that the software aims to solve, the requirements for developing complex environment wind engineering simulation software are proposed, and three main modules are identified for development: a structured-grid flow field solver, an unstructured-grid flow field solver, and a modeling module for complex terrain and surfaces. Second, appropriate mathematical and physical models and numerical solution algorithms are selected for the flow field solver. The CEWES software uses the finite volume method for discretization with second-order accuracy, solves the RANS equations based on the SIMPLE algorithm, uses the k-ε turbulence model to model turbulence, and supports large-scale parallel computation. Third, the software design was carried out in accordance with the requirements of the CFD solution process and of modular programming, focusing on the program architecture, data structures, and subroutine interface design, and the code was implemented based on the detailed design. Finally, the CEWES software was tested with typical examples. The test results show that the software's results have good accuracy, that it is capable of large-scale parallel computing, and that it is suitable for wind engineering simulations in complex terrain environments.
As conservative, high-order accurate, shock-capturing methods, weighted essentially non-oscillatory (WENO) schemes have been widely used to effectively resolve complicated flow structures in computational fluid dynamics (CFD) simulations. However, using a high-order WENO scheme can be highly time-consuming, which greatly limits the performance of CFD applications. In this paper, we present various parallel strategies based on the latest many-core platforms, such as the NVIDIA Fermi GPU, the NVIDIA Kepler GPU, and the Intel MIC coprocessor, to accelerate a high-order WENO scheme. A comparison of the two GPU generations, Fermi and Kepler, and a cross-platform performance analysis (focusing on the Kepler GPU and MIC) are also discussed in detail. The experiments show that the Kepler GPU offers a clear advantage over the previous Fermi GPU while running exactly the same source code. Furthermore, while the Kepler GPU can be several times faster than MIC when the SIMD computing power of the Vector Processing Unit (VPU) is not used, MIC can provide computing capability equivalent to the Kepler GPU when the VPU is utilized. Our implementations and optimization techniques can serve as case studies for parallelizing high-order schemes on many-core architectures.
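As an illustration of the kind of per-point arithmetic a WENO scheme involves, here is a minimal Python sketch of the classical fifth-order reconstruction of Jiang and Shu. The abstract does not say which order or variant is accelerated, so the choice of this variant, the function name `weno5_reconstruct`, and the value of the small parameter `eps` are all assumptions made for illustration.

```python
def weno5_reconstruct(v, eps=1e-6):
    """Fifth-order WENO reconstruction (Jiang-Shu weights).

    v holds the five cell averages v[i-2], v[i-1], v[i], v[i+1], v[i+2].
    Returns the left-biased approximation of the value at the
    interface i+1/2.
    """
    vm2, vm1, v0, vp1, vp2 = v

    # Third-order candidate reconstructions on the three sub-stencils.
    q0 = (2.0 * vm2 - 7.0 * vm1 + 11.0 * v0) / 6.0
    q1 = (-vm1 + 5.0 * v0 + 2.0 * vp1) / 6.0
    q2 = (2.0 * v0 + 5.0 * vp1 - vp2) / 6.0

    # Smoothness indicators: large where the sub-stencil crosses a jump.
    b0 = 13.0 / 12.0 * (vm2 - 2.0 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4.0 * vm1 + 3.0 * v0) ** 2
    b1 = 13.0 / 12.0 * (vm1 - 2.0 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    b2 = 13.0 / 12.0 * (v0 - 2.0 * vp1 + vp2) ** 2 + 0.25 * (3.0 * v0 - 4.0 * vp1 + vp2) ** 2

    # Nonlinear weights built from the linear weights 0.1, 0.6, 0.3.
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * q0 + a1 * q1 + a2 * q2) / total


if __name__ == "__main__":
    # Smooth (linear) data and data with a jump between cells i and i+1.
    print(weno5_reconstruct([-2.0, -1.0, 0.0, 1.0, 2.0]))
    print(weno5_reconstruct([0.0, 0.0, 0.0, 1.0, 1.0]))
```

Every interface evaluates the same fixed sequence of multiplications, additions, and divisions on a small local stencil. This is one reason such schemes are both expensive on a CPU and a natural fit for many-core hardware, where each thread or SIMD lane can handle one interface.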
The practical use of quality evaluation models is limited by a lack of quantitative experimental evaluation, which hinders their effective use and leaves little guidance for choosing a model. To address this problem, a machine learning-based method for verifying the validity of quality evaluation attributes is proposed, based on the sensitivity of the quality evaluation model to code defects. The method conducts comparative experiments with controlled variables. First, the basic metric elements are extracted; then, they are converted into quality attributes of the software; finally, to verify the quality evaluation model and the effectiveness of the intermediate quality attributes, this paper compares machine learning methods based on quality attributes with those based on text features and conducts an experimental evaluation on two data sets. The results show that, under controlled variables, quality attributes are more effective, leading by 15% with AdaBoostClassifier. When the text feature extraction method is increased to 50-150 dimensions, the performance of text features overtakes that of quality attributes in the four machine learning algorithms; once the peak is reached, however, quality attributes are more stable. This also provides a direction for optimizing the quality model and for using quality assessment in different situations.