The graphics processing unit (GPU) can run some large-scale simulations economically. However, harnessing the power of a GPU for discrete event simulation (DES) is difficult because of the mismatch between the GPU's synchronous execution mode and the asynchronous time-advance mechanism of DES. In this paper, we present a GPU-based simulation kernel (gDES) to support DES and propose three algorithms to improve its efficiency. Since both limited parallelism and redundant synchronization degrade the performance of GPU-based DES, we propose a breadth-expansion conservative time window algorithm that increases the degree of parallelism while keeping the number of synchronizations unchanged. By using the expansion method, it imports as many ‘safe’ events as possible. The irregular and dynamic storage requirements of events lead to uneven and sparse memory usage, which wastes memory and adds unnecessary overhead. We propose a memory management algorithm that stores events in a balanced and compact way by using a lightweight stochastic method. When the events processed by the threads in a warp have different types, the performance of gDES decreases rapidly because of branch divergence. We propose an event redistribution algorithm that reassigns events of the same type to neighboring threads to reduce the probability of branch divergence. We evaluate the proposed algorithms and gDES in a series of experiments. Compared with a CPU-based simulator on a multicore platform, gDES achieves up to 11×, 5×, and 8× speedup in PHOLD, QUEUING NETWORK, and epidemic simulation, respectively.
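To make the event redistribution idea concrete, the following is a minimal host-side sketch in Python, not the gDES implementation. It assumes that each event carries an integer type and that thread i of a launch processes the event in slot i; the function names `redistribute` and `divergent_warps` and the constant `WARP_SIZE` are hypothetical and chosen only for illustration. Grouping is done with a stable counting sort, which is one plausible way to place same-type events in neighboring slots; the abstract does not say which method gDES uses.

```python
from collections import Counter

WARP_SIZE = 32  # assumed warp width (threads per warp on NVIDIA GPUs)


def redistribute(event_types):
    """Return event indices reordered so that equal types are contiguous.

    Stable counting sort: events of the same type keep their relative order.
    """
    counts = Counter(event_types)
    # Compute the first slot of each type's bucket, in sorted type order.
    offsets, total = {}, 0
    for t in sorted(counts):
        offsets[t] = total
        total += counts[t]
    order = [0] * len(event_types)
    for idx, t in enumerate(event_types):
        order[offsets[t]] = idx  # place event idx in the next free slot of its bucket
        offsets[t] += 1
    return order


def divergent_warps(event_types, warp_size=WARP_SIZE):
    """Count warps whose slots hold more than one event type."""
    return sum(
        1
        for start in range(0, len(event_types), warp_size)
        if len(set(event_types[start:start + warp_size])) > 1
    )
```

Applying `redistribute` to a list of event types and then calling `divergent_warps` on the original and the reordered lists gives a rough measure of how many warps would have to serialize different event handlers before and after the reassignment.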
The graphics processing unit (GPU) offers an opportunity to implement large-scale simulations economically. GPU performance relies on high parallelism, but a synchronous conservative time management algorithm for discrete event simulation often encounters scenarios with limited parallelism. This conflict leads to poor performance even when the application itself is highly parallel. To solve this problem, we propose an expansion-aided synchronous conservative time management algorithm. It uses runtime information to enlarge the time bound of "safe" events and uses an expansion method to import "safe" events. By interleaving a series of expansions with event computation, more events can be gathered and processed in parallel. Moreover, a simulated annealing algorithm is adopted to control the number of expansions. It helps achieve stable performance under different conditions by balancing low parallelism against unnecessary expansions. Experiments demonstrate that the proposed algorithm can achieve up to a 30% performance improvement.
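For readers unfamiliar with the baseline that the expansion method builds on, the sketch below shows the standard lookahead-based synchronous conservative window in Python. It is not the expansion-aided algorithm itself: it only selects the events that are "safe" under a fixed bound, which is the step that limits parallelism when few events fall inside the window. The function name `safe_events`, the `(timestamp, lp_id)` event layout, and the single global `lookahead` value are assumptions made for illustration.

```python
def safe_events(pending, lookahead):
    """Split pending events into (safe, deferred) for one synchronous step.

    pending   -- list of (timestamp, lp_id) tuples (assumed layout)
    lookahead -- minimum delay between an event and any event it schedules

    An event is safe if its timestamp is below the earliest pending
    timestamp plus the lookahead, because no event processed in this
    step can schedule a new event earlier than that bound.
    """
    if not pending:
        return [], []
    bound = min(ts for ts, _ in pending) + lookahead
    safe = [e for e in pending if e[0] < bound]
    deferred = [e for e in pending if e[0] >= bound]
    return safe, deferred
```

With a small lookahead, `safe` can contain only a handful of events per step, leaving most GPU threads idle; this is the limited-parallelism scenario that the abstract's enlarged time bound and expansions aim to avoid.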