As high-performance computing systems scale up, mapping the tasks of a parallel application onto physical processors so that they communicate efficiently becomes a critical performance issue. Existing algorithms were usually designed to map applications with regular communication patterns. Their mapping criteria usually overlook the size of communicated messages, which is the primary factor in communication time. In addition, the time complexity of most of them is too high for large-scale problems. In this paper, we present a hierarchical mapping algorithm (HMA), which is capable of mapping applications with irregular communication patterns. It first partitions tasks according to their run-time communication information. Tasks that communicate with each other more frequently are regarded as strongly connected. Based on their connectivity strength, the tasks are partitioned into supernodes using algorithms from spectral graph theory. The hierarchical partitioning reduces the complexity of the mapping algorithm and thereby achieves scalability. Finally, the run-time communication information is used again in a fine-tuning step to explore better mappings. Our experiments show that the mapping algorithm reduces the point-to-point communication time of PDGEMM, a ScaLAPACK matrix multiplication kernel, by up to 20% and that of AMG2006, a tier 1 application of the Sequoia benchmark, by up to 7%.
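The abstract does not say which spectral method HMA uses, so the following is only a minimal sketch of one standard option, spectral bisection with the Fiedler vector of the graph Laplacian. The function names, the `weights` matrix (assumed to hold pairwise communication volume between tasks), and the use of shifted power iteration instead of a library eigensolver are all assumptions made for illustration, not details taken from the paper.

```python
import random


def fiedler_vector(weights, iters=500, seed=0):
    """Approximate the Fiedler vector of a weighted graph (hypothetical helper).

    weights[i][j] is an assumed communication volume between tasks i and j.
    Requires at least two tasks and a connected graph for a meaningful result.
    """
    n = len(weights)
    # Symmetrize so that traffic i->j and j->i forms one undirected edge weight.
    w = [[0.0 if i == j else (weights[i][j] + weights[j][i]) / 2.0
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    # Gershgorin bound: every Laplacian eigenvalue is at most 2 * max degree.
    shift = 2.0 * max(deg) + 1.0
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # y = (shift * I - L) x with L = D - W; power iteration on this matrix
        # converges toward the eigenvectors of L with the smallest eigenvalues.
        y = [(shift - deg[i]) * x[i] + sum(w[i][j] * x[j] for j in range(n))
             for i in range(n)]
        # Remove the constant component (the trivial eigenvector of L).
        mean = sum(y) / n
        y = [v - mean for v in y]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            break
        x = [v / norm for v in y]
    return x


def spectral_bisect(weights):
    """Split tasks into two equal-sized groups by sorting on the Fiedler vector."""
    x = fiedler_vector(weights)
    order = sorted(range(len(x)), key=lambda i: x[i])
    half = len(order) // 2
    return sorted(order[:half]), sorted(order[half:])


if __name__ == "__main__":
    # Two groups of four tasks with heavy internal traffic and one light link.
    comm = [[0] * 8 for _ in range(8)]
    for group in (range(0, 4), range(4, 8)):
        for i in group:
            for j in group:
                if i != j:
                    comm[i][j] = 10
    comm[3][4] = comm[4][3] = 1
    print(spectral_bisect(comm))
```

Applying such a bisection recursively would yield a hierarchy of task groups, which is one plausible reading of the "supernodes" described above; how HMA actually builds and maps its hierarchy is not specified in the abstract.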
Box intersection checking is a common task in many large-scale simulations. Traditional methods cannot check box intersections quickly on large-scale datasets. This article presents a parallel algorithm to perform Pairwise Box Intersection checking on Graphics processing units (PBIG). The PBIG algorithm consists of three phases: planning, mapping, and checking. The planning phase partitions the space into small cells, the sizes of which are chosen to optimize performance. The mapping phase maps the boxes into the cells. The checking phase examines the intersections among boxes in the same cell. Several performance optimizations, including load balancing, output data compression/encoding, and pipelined execution, are presented for the PBIG algorithm. The experimental results show that the PBIG algorithm can process large-scale datasets and outperforms three well-performing algorithms.
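The three phases can be illustrated with a small serial sketch. This is not the PBIG GPU implementation: it runs on the CPU, takes the cell size as a parameter instead of tuning it in a planning step, and omits the load-balancing, compression, and pipelining optimizations. The function names and the box representation (a pair of min and max corner tuples) are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations, product


def boxes_intersect(a, b):
    """Axis-aligned overlap test; boxes that only touch count as intersecting."""
    return all(a[0][d] <= b[1][d] and b[0][d] <= a[1][d]
               for d in range(len(a[0])))


def pairwise_intersections(boxes, cell_size):
    """Return sorted index pairs (i, j), i < j, of intersecting boxes.

    boxes: list of (min_corner, max_corner) tuples (assumed representation).
    cell_size: edge length of the uniform grid cells; the planning phase in
    the source chooses this for performance, here it is simply an input.
    """
    # Mapping phase: register each box in every grid cell it overlaps.
    cells = defaultdict(list)
    for idx, (lo, hi) in enumerate(boxes):
        ranges = [range(int(l // cell_size), int(h // cell_size) + 1)
                  for l, h in zip(lo, hi)]
        for cell in product(*ranges):
            cells[cell].append(idx)

    # Checking phase: test only pairs of boxes that share a cell.
    found = set()
    for members in cells.values():
        # Indices were appended in increasing order, so i < j in each pair.
        for i, j in combinations(members, 2):
            if (i, j) not in found and boxes_intersect(boxes[i], boxes[j]):
                found.add((i, j))
    return sorted(found)


if __name__ == "__main__":
    boxes = [((0, 0), (2, 2)), ((1, 1), (3, 3)),
             ((5, 5), (6, 6)), ((2.5, 0), (4, 1.2))]
    print(pairwise_intersections(boxes, cell_size=2))
```

Two intersecting boxes always share at least one cell, because any point in their overlap lies in a cell that both boxes cover, so the grid never hides a true intersection. The `found` set removes duplicates for pairs that share several cells; a GPU version would need a different deduplication strategy, which this sketch does not attempt to model.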
Conventional sub-trajectory clustering is used to identify similarities among multiple trajectories. Existing methods tend to overlook many of the relevant sub-trajectories; others require a road network as input; all are slowed down considerably by large datasets. In this paper, we propose a novel approach to sub-trajectory clustering in which trajectories are transformed into a set of Hypercubes. The Hypercubes are pairwise-matched to find intersections and then clustered accordingly. The proposed scheme was compared with grid clustering (i.e., a constant-time technique) in terms of memory usage and computational speed, and with a state-of-the-art method, TraClus, in terms of accuracy. The experimental results show that Hypercube clustering can identify common sub-trajectories more swiftly and with less memory usage than grid clustering. The accuracy of Hypercube clustering is superior to that of TraClus. INDEX TERMS: Urban computing, similar trajectories, ridesharing paths, common sub-trajectories clustering.