Abstract-Many important problems in computational sciences, social network analysis, security, and business analytics, are data-intensive and lend themselves to graph-theoretical analyses. In this paper we investigate the challenges involved in exploring very large graphs by designing a breadth-first search (BFS) algorithm for advanced multi-core processors that are likely to become the building blocks of future exascale systems. Our new methodology for large-scale graph analytics combines a highlevel algorithmic design that captures the machine-independent aspects, to guarantee portability with performance to future processors, with an implementation that embeds processorspecific optimizations. We present an experimental study that uses state-of-the-art Intel Nehalem EP and EX processors and up to 64 threads in a single system. Our performance on several benchmark problems representative of the power-law graphs found in real-world problems reaches processing rates that are competitive with supercomputing results in the recent literature. In the experimental evaluation we prove that our graph exploration algorithm running on a 4-socket Nehalem EX is (1) 2.4 times faster than a Cray XMT with 128 processors when exploring a random graph with 64 million vertices and 512 millions edges, (2) capable of processing 550 million edges per second with an R-MAT graph with 200 million vertices and 1 billion edges, comparable to the performance of a similar graph on a Cray MTA-2 with 40 processors and (3) 5 times faster than 256 BlueGene/L processors on a graph with average degree 50.
By 2010, the global options and equity markets will average over 128 billion messages per day, amounting to trillions of dollars in trades. Trading systems, the backbone of the low-latency high-frequency business, need fundamental research and innovation to overcome their current processing bottlenecks. With market data rates rapidly growing, the financial community is demanding solutions that are extremely fast, flexible, adaptive, and easy to manage. This paper explores multiple avenues to deal with the decoding and normalization of Option Price Reporting Authority (OPRA) stock market data feeds encoded with FIX Adapted for Streaming (FAST) representation, on commodity multicore platforms, and describes a novel solution that encodes the OPRA protocol with a high-level description language. Our algorithm achieves a processing rate of 15 million messages per second in the fastest single socket configuration on an Intel Xeon E5472, which is an order of magnitude higher than the current needs of the financial systems. We also present an in-depth performance evaluation that exposes important properties of our OPRA parsing algorithm on a collection of multicore processors.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.