Power density and cooling issues are limiting the performance of high performance chip multiprocessors (CMPs), and off-chip communications currently consume over 20% of power for memory, coherence, PCI and Ethernet links. Photonic transceivers integrated with CMPs are being developed to overcome these issues, potentially allowing low hop count switched connections between chips or data center servers. However, latency in setting up optical connections is critically important in all computing applications, and having transceivers integrated on the processor chip also pushes other network functions and their associated power consumption onto the chip. In this paper, we propose a low latency optical switch architecture which minimizes power consumed on the processor chip for two scenarios: multiple socket shared memory coherence networks and optical top-of-rack switches for data centers. The switch architecture reduces power consumed on the CMP using a control plane with a simplified send-and-forget server interface and the use of a hybrid Mach-Zehnder Interferometer (MZI) and semiconductor optical amplifier (SOA) integrated optical switch with electronic buffering. Results show that the proposed architecture offers a 42% reduction in head latency at low loads compared with a conventional scheduled optical switch, as well as offering increased performance for streaming and incast traffic patterns. Power dissipated on the server chip is shown to be reduced by over 60% compared with a
Manuscript received July 1, 2014. S. Liu was with the Electronic Engineering Department, University College London, London WC1E 7JE, UK. She is now with Barclays Investment Bank, London, UK.
Optical networks on chip (NoCs) based on silicon photonics have been proposed to reduce latency and power consumption in future chip multiprocessors (CMPs). However, high performance CMPs use a shared memory model which generates large numbers of short messages, typically of the order of 8-256 B. Messages of this length create high overhead for optical switching systems due to arbitration and switching times. Current schemes only start the arbitration process when the message arrives at the input buffer of the network. In this paper, we propose a scheme which intelligently uses the information from the memory controllers to schedule optical paths. We identified predictable patterns of messages associated with memory operations for a 32 core x86 system using the MESI coherency protocol. We used the first message of each pattern to open the optical paths which will be used by all subsequent messages, thereby eliminating arbitration time for the latter. Without considering the initial request message, this scheme can therefore reduce the time of flight of a data message in the network by 29% and that of a control message by 67%. We demonstrate the benefits of this scheduling algorithm for applications in the PARSEC benchmark suite, with overall average reductions in overhead latency per message of 31.8% for the streamcluster benchmark and 70.6% for the swaptions benchmark.
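The core idea of the abstract above — using the first message of a predictable coherence pattern to open every optical path the remaining messages will need — can be illustrated with a toy model. This is a hedged sketch, not the paper's implementation: the arbitration time, the read-miss message sequence, and the node names are illustrative assumptions.

```python
# Toy model of path pre-opening for a coherence transaction.
# On the first message of a recognized pattern, reserve the optical
# paths that all subsequent messages of that transaction will use,
# so they incur no arbitration delay. All numbers are assumptions.

ARBITRATION_NS = 8  # assumed arbitration time per path setup (ns)

# Assumed MESI-style read-miss pattern: request, data reply, ack.
# Each entry is a (source, destination) optical path.
READ_MISS_PATTERN = [("core", "home"), ("home", "core"), ("core", "home")]

def transaction_arbitration(pre_open: bool) -> int:
    """Total arbitration latency for one read-miss transaction."""
    open_paths = set()
    total = 0
    for i, path in enumerate(READ_MISS_PATTERN):
        if path in open_paths:
            continue                    # path already open: no arbitration
        total += ARBITRATION_NS         # first use of a path pays setup
        if pre_open and i == 0:
            # The first message triggers setup of every path in the pattern.
            open_paths.update(READ_MISS_PATTERN)
        else:
            open_paths.add(path)
    return total

baseline = transaction_arbitration(pre_open=False)  # arbitrate per path
predicted = transaction_arbitration(pre_open=True)  # arbitrate once up front
```

In this toy trace the pre-opening scheme pays arbitration only on the initial request, matching the abstract's claim that arbitration time is eliminated for all subsequent messages of the pattern.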
Optical networks on chip based on silicon photonics have been proposed to reduce latency and power consumption in future chip multiprocessors. However, high performance chip multiprocessors use a shared memory model, which generates large numbers of short messages, creating high arbitration latency overhead for photonic switching networks. In this paper, we explore techniques that intelligently use information from the memory hierarchy to predict communication in order to set up photonic circuits with reduced or eliminated arbitration latency. Firstly, we present a switch scheduling algorithm which arbitrates on a per memory transaction basis and holds open photonic circuits to exploit temporal locality. We show that this can reduce the average arbitration latency overhead by 60% and eliminate arbitration latency altogether for up to 70% of memory transactions. We then demonstrate that this switch scheduling algorithm operating with a central photonic crossbar or Clos switch has significant energy efficiency benefits over arbitration-free photonic networks such as single writer multiple reader networks. Finally, we demonstrate that cache miss prediction can be used to predict 86% of more complex memory transactions involving multiple nodes or main memory.
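The hold-open scheduling described above can be sketched as an LRU pool of photonic circuits: a transaction between a node pair arbitrates only if no circuit is already held open for that pair. This is a minimal sketch under assumed parameters (the capacity limit, node IDs, and traffic trace are illustrative, not from the paper).

```python
# Minimal sketch of holding photonic circuits open after a memory
# transaction so later transactions between the same node pair reuse
# them without re-arbitration. Capacity and trace are assumptions.

from collections import OrderedDict

class HoldOpenScheduler:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.open = OrderedDict()   # (src, dst) -> None, in LRU order
        self.arbitrations = 0

    def request(self, src: int, dst: int) -> None:
        key = (src, dst)
        if key in self.open:
            self.open.move_to_end(key)      # reuse a held-open circuit
            return
        self.arbitrations += 1              # must arbitrate a new circuit
        if len(self.open) >= self.capacity:
            self.open.popitem(last=False)   # tear down least-recently-used
        self.open[key] = None

sched = HoldOpenScheduler(capacity=2)
# Repeated traffic between the same pair exhibits the temporal locality
# that hold-open scheduling exploits.
trace = [(0, 1), (1, 0), (0, 1), (0, 1), (1, 0), (2, 3), (0, 1)]
for src, dst in trace:
    sched.request(src, dst)
```

Running the trace, only 4 of the 7 requests arbitrate; the rest hit a held-open circuit, which is the mechanism behind the abstract's claim that arbitration latency is eliminated for a large fraction of transactions.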