Figure 1: Lane Formation. Starting from the four corners of a crossroad, 400 agents cross the streets. As they move in opposite directions, the agents automatically form lanes.
Abstract: We present a novel approach for interactive navigation and planning of multiple agents in crowded scenes with moving obstacles. Our formulation uses a precomputed roadmap that provides macroscopic, global connectivity for wayfinding and combines it with fast and localized navigation for each agent. At runtime, each agent senses the environment independently and computes a collision-free path based on an extended "Velocity Obstacles" concept. Furthermore, our algorithm ensures that each agent exhibits no oscillatory behavior. We have tested the performance of our algorithm in several challenging scenarios with a high density of virtual agents. In practice, the algorithm's performance scales almost linearly with the number of agents, and it can run at interactive rates on multi-core processors.
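As a rough illustration of the basic Velocity Obstacles idea that the abstract builds on (not the extended formulation the paper proposes), the following Python sketch tests whether one agent, holding its current velocity, would collide with another within a time horizon. The function names, the modelling of agents as 2D discs, and the `horizon` parameter are assumptions made for this example only.

```python
import math


def time_to_collision(p_a, v_a, r_a, p_b, v_b, r_b):
    """Time until discs A and B first touch if both keep their velocities.

    Returns 0.0 if they already overlap and None if they never touch.
    Positions and velocities are (x, y) tuples; radii are floats.
    """
    # Work in B's frame: B is static, A moves with the relative velocity.
    px, py = p_b[0] - p_a[0], p_b[1] - p_a[1]
    vx, vy = v_a[0] - v_b[0], v_a[1] - v_b[1]
    r = r_a + r_b

    # Solve |p - v*t|^2 = r^2, i.e. a*t^2 - 2*b*t + c = 0, for the smallest t >= 0.
    c = px * px + py * py - r * r
    if c <= 0.0:
        return 0.0  # already overlapping
    a = vx * vx + vy * vy
    b = px * vx + py * vy
    if a == 0.0 or b <= 0.0:
        return None  # no relative motion, or moving apart
    disc = b * b - a * c
    if disc < 0.0:
        return None  # the relative velocity misses the combined disc
    return (b - math.sqrt(disc)) / a


def in_velocity_obstacle(p_a, v_a, r_a, p_b, v_b, r_b, horizon=math.inf):
    """True if v_a lies in the velocity obstacle that B induces for A."""
    t = time_to_collision(p_a, v_a, r_a, p_b, v_b, r_b)
    return t is not None and t <= horizon
```

An agent using this test would discard candidate velocities for which `in_velocity_obstacle` returns `True` against any neighbor and pick the remaining candidate closest to its preferred velocity. How the paper extends this test and suppresses oscillation is not specified in the abstract and is not modelled here.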
The term "performance portability" has been informally used in computing to refer to a variety of notions that generally include: 1) the ability to run one application across multiple hardware platforms; and 2) achieving some notional level of performance on these platforms. However, there has been a noticeable lack of consensus on the precise meaning of the term, and authors' conclusions regarding their success (or failure) in achieving performance portability have thus been subjective. Comparisons of one approach to performance portability with another have generally been marked by vague claims and verbose, qualitative explanations. This paper presents a concise definition for performance portability, along with a simple metric that accurately captures the performance and portability of an application across different platforms. The utility of this metric is then demonstrated with a retroactive application to previous work.
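The abstract above does not state the metric's formula. One formulation commonly associated with this line of work, and assumed here, is the harmonic mean of an application's performance efficiency over a set of platforms, defined as zero if any platform in the set is unsupported. The Python sketch below illustrates that assumed form; the function name, the dictionary input, and the use of `None` for an unsupported platform are choices made for this example.

```python
def performance_portability(efficiencies):
    """Harmonic mean of per-platform efficiencies (assumed form of the metric).

    `efficiencies` maps a platform name to an efficiency in (0, 1], or to
    None if the application does not run on that platform. The score is 0.0
    if any platform is unsupported.
    """
    if not efficiencies:
        raise ValueError("at least one platform is required")
    values = list(efficiencies.values())
    # An unsupported (or zero-efficiency) platform makes the score zero.
    if any(e is None or e <= 0.0 for e in values):
        return 0.0
    return len(values) / sum(1.0 / e for e in values)
```

For example, efficiencies of 0.5 on one platform and 1.0 on another give a score of about 0.67, lower than the arithmetic mean of 0.75, because the harmonic mean penalizes the weaker platform more heavily.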
We present a novel method for the synthesis and animation of realistic traffic flows on large-scale road networks. Our technique is based on a continuum model of traffic flow that we extend to correctly handle lane changes and merges, as well as traffic behaviors due to changes in speed limit. We demonstrate how our method can be applied to the animation of many vehicles in a large-scale traffic network at interactive rates and show that our method can simulate believable traffic flows on publicly available, real-world road data. We furthermore demonstrate the scalability of this technique on many-core systems.
Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters ΩM, σ8, and ns with unprecedented accuracy.