Forecasting ocean drift trajectories is important for many applications, including search and rescue operations, oil spill cleanup, and iceberg risk mitigation. In an operational setting, forecasts of drift trajectories are produced from computationally demanding forecasts of three-dimensional ocean currents. Herein, we investigate a complementary approach for shorter time scales by applying a recent state-of-the-art implicit equal-weights particle filter to a simplified ocean model. To achieve this, we present a new algorithmic design for a data-assimilation system in which all components (including the model, model errors, and particle filter) take advantage of massively parallel compute architectures, such as graphical processing units. Faster computations can enable in-situ and ad-hoc model runs for emergency management, as well as larger ensembles for better uncertainty quantification. Using a challenging test case with near-realistic chaotic instabilities, we run data-assimilation experiments based on synthetic observations from drifting and moored buoys, and analyse the trajectory forecasts for the drifters. Our results show that even sparse drifter observations are sufficient to significantly improve short-term drift forecasts up to twelve hours. With equidistant moored buoys observing only 0.1% of the state space, the ensemble gives an accurate description of the true state after data assimilation, followed by a high-quality probabilistic forecast.
Corresponding author: havard.heitlo.holm@sintef.no (arXiv:1910.01031v1 [stat.CO], 2 Oct 2019)
OpenDrift is an offline trajectory model: it reads the ocean current forecasts produced by the ocean circulation models and uses these to predict drift trajectories. Although OpenDrift is computationally efficient, the ocean circulation models still require access to supercomputers.
This paper explores the option of using a state-of-the-art particle filter method applied to a simplified ocean model for efficient drift trajectory forecasting. The aim is to build a data-assimilation system that can run efficiently on commodity-level desktop computers and also be extended to supercomputers. We achieve this by using a simplified ocean model and a data-assimilation method that are both able to take advantage of massively parallel accelerator hardware, such as the graphics processing unit (GPU). This work is not intended as a substitute for current operational systems, but as a complementary approach in which the predicted currents may even be updated with in-situ observations, e.g., during ongoing search and rescue operations. Furthermore, by enabling research models to run on individual desktop and laptop computers, researchers can do more rapid prototyping. At the same time, this work will also contribute to more efficient simulations on supercomputers, since all algorithms may be extended to run on multiple GPUs and compute nodes. The paper is organized as follows: we start by reviewing related work relevant for Lagrangian data assimilation with accelerated pa...
The shallow-water equations in a rotating frame of reference are important for capturing geophysical flows in the ocean. In this paper, we examine and compare two traditional finite-difference schemes and two modern finite-volume schemes for simulating these equations. We evaluate how well they capture the relevant physics for problems such as storm surge and drift trajectory modelling, and the schemes are put through a set of six test cases. The results are presented in a systematic manner through several tables, and we compare the qualitative and quantitative performance from a cost-benefit perspective. Of the four schemes, one of the traditional finite-difference schemes performs best in cases dominated by geostrophic balance, and one of the modern finite-volume schemes is superior for capturing gravity-driven motion. The traditional finite-difference schemes are significantly faster computationally than the modern finite-volume schemes.
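For reference, a standard conservative form of the rotating shallow-water equations that such schemes discretize (flat bathymetry, no forcing; h is the water depth, (u, v) the depth-averaged velocity, f the Coriolis parameter, and g the gravitational acceleration) can be written as:

```latex
\begin{aligned}
\frac{\partial h}{\partial t} + \frac{\partial (hu)}{\partial x} + \frac{\partial (hv)}{\partial y} &= 0, \\
\frac{\partial (hu)}{\partial t} + \frac{\partial}{\partial x}\!\left(hu^2 + \tfrac{1}{2}gh^2\right) + \frac{\partial (huv)}{\partial y} &= fhv, \\
\frac{\partial (hv)}{\partial t} + \frac{\partial (huv)}{\partial x} + \frac{\partial}{\partial y}\!\left(hv^2 + \tfrac{1}{2}gh^2\right) &= -fhu.
\end{aligned}
```

The Coriolis terms on the right-hand side are what make geostrophic balance (pressure gradient balancing rotation) a relevant test of the schemes compared above.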
In this work, we examine the performance, energy efficiency, and usability of Python for developing HPC codes running on the GPU. We investigate the portability of performance and energy efficiency between CUDA and OpenCL; between GPU generations; and between low-end, mid-range, and high-end GPUs. Our findings show that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments show that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency. We show that accessing the GPU from Python is as efficient as from C/C++ in many cases, demonstrate how profile-driven development in Python is essential for increasing the performance of GPU code (by up to 5 times), and show that the energy efficiency increases proportionally with performance tuning. Finally, we investigate the portability of the improvements and power efficiency both between CUDA and OpenCL and between different GPUs. Our findings are summarized in tables that justify that using Python can be preferable to C++, and that using CUDA can be preferable to using OpenCL. Our observations should be directly transferable to other similar architectures and problems.
Related Work
There are several high-level programming languages and libraries that offer access to the GPU for certain sets of problems and algorithms. OpenACC [14] is one example, which is pragma-based and offers a set of directives to execute code in parallel on the GPU. However, such high-level abstractions are typically only efficient for certain classes of problems and are often unsuitable for non-trivial parallelization or data movement.
CUDA [15] and OpenCL [16] are two programming languages that offer full access to the GPU hardware, including the whole memory subsystem. This is an especially important point, since memory movement is a key bottleneck in many numerical algorithms [6] and therefore has a significant impact on energy consumption. The performance of GPUs has been reported extensively [17], and several authors have shown that GPUs are efficient in terms of energy-to-solution. Huang et al. [18] demonstrated early on that GPUs could not only speed up computational performance, but also increase power efficiency dramatically using CUDA. Qi et al. [19] show how OpenCL on a mobile GPU can increase the performance of the discrete Fourier transform by 1.4 times and decrease the energy use by 37%. Dong et al. [20] analyze the energy efficiency of GPU BLAST, which simulates compressible hydrodynamics using finite elements with CUDA, and report a 2.5 times speedup and a 42% increase in energy efficiency. Klôh [21] reports that there is a wide spread in terms of energy efficiency and performance when comparing 3D wave propagation and full waveform inversio...
In this work, we take a modern high-resolution finite-volume scheme for solving the rotational shallow-water equations and extend it with features required to run real-world ocean simulations. Our contributions include a spatially varying north vector and Coriolis term required for large-scale domains, moving wet-dry fronts, a static land mask, bottom shear stress, wind forcing, boundary conditions for nesting in a global model, and an efficient model reformulation that makes it well-suited for massively parallel implementations. The order of our model is verified using a grid convergence test, and we show numerical experiments using three different sections along the coast of Norway based on data originating from operational forecasts run at the Norwegian Meteorological Institute. Our simulation framework shows perfect weak scaling on a modern P100 GPU, and is capable of providing tidal wave forecasts that are very close to the operational model at a fraction of the cost. All source code and data used in this work are publicly available under open licenses.
This paper provides a bivariate distribution of wave power and significant wave height, as well as a bivariate distribution of wave power and a characteristic wave period for sea states, and discusses the statistical aspects of wave power for sea states. This is relevant for, e.g., assessing wave power devices and their potential for converting energy from waves. The results can be applied to systematically compare the wave power potential at different locations based on a long-term statistical description of the wave climate.
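For context, wave power for a sea state is commonly estimated from significant wave height and a characteristic (energy) period via the deep-water energy-flux approximation. The sketch below is illustrative, not taken from the paper; the density and gravity constants are assumed standard values.

```python
import math

RHO = 1025.0  # sea-water density [kg/m^3] (assumed typical value)
G = 9.81      # gravitational acceleration [m/s^2]

def wave_power(hs, te):
    """Deep-water wave energy flux per unit crest length [W/m].

    hs: significant wave height [m]; te: energy period [s].
    Uses the standard deep-water approximation
    P = rho * g^2 * Hs^2 * Te / (64 * pi),
    i.e. roughly 0.5 * Hs^2 * Te in kW/m for sea water.
    """
    return RHO * G**2 * hs**2 * te / (64.0 * math.pi)
```

For example, a sea state with Hs = 2 m and Te = 8 s carries roughly 15-16 kW per metre of wave crest under this approximation; note the quadratic dependence on Hs, which is why the joint (bivariate) distribution with wave height matters for resource assessment.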
The Finite Element Method (FEM) has been extensively applied to model failure of ice, but it suffers from mesh distortion, especially when fracture or fragmentation is involved. The Smoothed Particle Hydrodynamics (SPH) method avoids this class of problems and has been successfully applied to simulate fracture of solids. However, the associated computational cost increases considerably for SPH simulations. Reducing the computational time without any significant decrease in accuracy is therefore critical for SPH to be considered an efficient numerical tool for simulating failure. Thus, the focus of the present article was to investigate the feasibility of different approaches (domain decomposition, mass scaling, time scaling, and coupled SPH-FEM techniques) to reduce the computational resource requirements. The accuracy, efficiency, and limitations of each of these approaches were discussed, and the results were compared with an analytical solution and four-point beam bending experiments. The results drawn from the comparisons substantiate that domain decomposition, mass scaling, and process time scaling can be adequately used for quasi-static cases to reduce the CPU requirements, as long as the kinetic energy is constantly monitored to ensure that inertial effects are negligible. Furthermore, as the computational time primarily depends on the number of discrete particles in a simulation, the coupled SPH-FEM method was identified as a viable alternative for reducing the simulation time, and the results from such coupled simulations agree well with published experimental data. This study showed that the proposed methods were not only able to emulate the failure mechanisms observed during experimental investigations, but also reduce the computational resource requirements associated with pure SPH simulations, without any significant reduction in numerical accuracy and stability.
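The mass-scaling idea mentioned above exploits the CFL-type stability limit of explicit solvers: the stable time step is proportional to the element (or particle) size divided by the elastic wave speed, so artificially increasing the density slows the wave speed and enlarges the step. The snippet below is a simplified 1D sketch of this reasoning, not the article's SPH code; the material values are illustrative.

```python
import math

def critical_timestep(h, E, rho):
    """CFL-type stable time step for an explicit solver:
    dt = h / c, with elastic wave speed c = sqrt(E / rho).
    (Simplified 1D estimate; production codes apply a safety factor.)

    h: characteristic element/particle size [m]
    E: Young's modulus [Pa]; rho: density [kg/m^3]
    """
    c = math.sqrt(E / rho)
    return h / c

def mass_scaled_timestep(h, E, rho, factor):
    """Mass scaling: multiplying the density by factor**2 reduces the
    wave speed by `factor` and enlarges the stable step by the same
    factor. Valid only for quasi-static problems where the (now
    exaggerated) inertial effects remain negligible."""
    return critical_timestep(h, E, rho * factor**2)
```

This is why the article stresses monitoring the kinetic energy: the speedup comes from inflating inertia, which is only admissible while inertial effects stay small compared to the quasi-static response.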
In this work, we perform fully nonlinear data assimilation of ocean drift trajectories using multiple GPUs. We use an ensemble of up to 10000 members and the sequential importance resampling algorithm to assimilate observations of drift trajectories into the underlying shallow-water simulation model. Our results show an improved drift trajectory forecast using data assimilation for a complex and realistic simulation scenario, and the implementation exhibits good weak and strong scaling.
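The sequential importance resampling (SIR) step used above can be sketched in plain NumPy. This is an illustrative single-process CPU sketch with an assumed Gaussian observation likelihood, not the paper's multi-GPU implementation.

```python
import numpy as np

def sir_step(particles, weights, observation, obs_std, rng):
    """One sequential importance resampling step.

    particles: (N, d) array of ensemble states (e.g. drifter positions)
    weights:   (N,) current importance weights
    observation: (d,) observed state; obs_std: observation error std.
    Weights are updated with the Gaussian likelihood of the observation
    given each particle, then N particles are redrawn with probability
    proportional to their weight (multinomial resampling).
    """
    # Update importance weights with the observation likelihood.
    sq_dist = np.sum((particles - observation) ** 2, axis=1)
    w = weights * np.exp(-0.5 * sq_dist / obs_std**2)
    w /= w.sum()

    # Multinomial resampling: duplicate likely particles, drop unlikely ones.
    n = len(particles)
    idx = rng.choice(n, size=n, p=w)
    return particles[idx], np.full(n, 1.0 / n)
```

After resampling, the ensemble concentrates around states consistent with the observed drift trajectory, and the uniform weights make the filter ready for the next forecast-assimilation cycle.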