HAsim: FPGA-based high-detail multicore simulation using time-division multiplexing

Pellauer, Michael; Adler, Michael; Kinsy, Michel A.; Parashar, Angshuman; Emer, Joel

doi:10.1109/hpca.2011.5749747

Cited by 73 publications

(48 citation statements)

References 16 publications

“…not neighboring) sets of routers are simulated at the same time. This issue has been studied in previous work [5] and a straightforward way to resolve it would be through a separate preprocessing step that identifies independent sets of routers in the network and then generates a fixed valid simulation schedule.…”

Section: Discussion (mentioning)

confidence: 99%
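
The preprocessing step described in the statement above can be sketched as a greedy graph coloring: each router is assigned to the lowest-numbered slot not already used by one of its neighbors, so every slot is an independent set. This is a minimal illustration only, not the scheme from the cited work; the function names `mesh_neighbors` and `independent_set_schedule` and the 2D-mesh topology are assumptions made for the example.

```python
def mesh_neighbors(width, height):
    """Adjacency of a width x height 2D mesh; router id = y * width + x.
    (Hypothetical helper: the mesh topology is assumed for illustration.)"""
    neighbors = {}
    for y in range(height):
        for x in range(width):
            adjacent = set()
            if x > 0:
                adjacent.add(y * width + (x - 1))
            if x < width - 1:
                adjacent.add(y * width + (x + 1))
            if y > 0:
                adjacent.add((y - 1) * width + x)
            if y < height - 1:
                adjacent.add((y + 1) * width + x)
            neighbors[y * width + x] = adjacent
    return neighbors


def independent_set_schedule(neighbors):
    """Greedy coloring: returns a list of slots, each a list of routers
    that are pairwise non-adjacent and may be simulated together."""
    slot_of = {}
    for router in sorted(neighbors):
        # Slots already taken by neighbors that were placed earlier.
        used = {slot_of[n] for n in neighbors[router] if n in slot_of}
        slot = 0
        while slot in used:
            slot += 1
        slot_of[router] = slot
    if not slot_of:
        return []
    schedule = [[] for _ in range(max(slot_of.values()) + 1)]
    for router in sorted(slot_of):
        schedule[slot_of[router]].append(router)
    return schedule


if __name__ == "__main__":
    print(independent_set_schedule(mesh_neighbors(2, 2)))
```

Because the schedule depends only on the topology, it can be computed once and reused as a fixed schedule for every simulated cycle. Greedy coloring does not guarantee the minimum number of slots on arbitrary graphs.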

“…As has been shown in previous work [5], [6], time-multiplexing in the context of network simulation requires special care to retain proper ordering of events and careful state management to ensure that all routers in the network have a consistent view of the system. For instance, within a single target cycle, a router might send traffic to routers that were simulated in previous host cycles, but also send traffic to routers that will be simulated in subsequent host cycles.…”

Section: B Virtualized Implementation (mentioning)

confidence: 99%
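
The ordering problem described in the statement above is commonly handled by double-buffering link state: traffic produced during target cycle t is written to a separate buffer and becomes visible only in target cycle t+1, so the order in which a single engine visits the routers does not change the result. The sketch below illustrates that idea under stated assumptions; it is not the implementation from either cited paper, and `simulate_target_cycle`, `ring_route`, and the ring topology are hypothetical names chosen for the example.

```python
def simulate_target_cycle(inbox, route, order):
    """Advance every router by one target cycle using one shared engine.

    inbox: dict mapping router id -> list of flits visible this target cycle.
    route: function (router, flit) -> destination router id.
    order: sequence in which the engine visits routers (one per host step).
    Returns the inbox contents for the next target cycle.
    """
    # Writes go to a fresh buffer, so a router visited later in this
    # target cycle never observes traffic produced earlier in the same cycle.
    next_inbox = {router: [] for router in inbox}
    for router in order:
        for flit in inbox[router]:
            next_inbox[route(router, flit)].append(flit)
    return next_inbox


def ring_route(num_routers):
    """Hypothetical routing function: forward every flit one hop around a ring."""
    return lambda router, flit: (router + 1) % num_routers


if __name__ == "__main__":
    state = {0: ["a"], 1: ["b"], 2: [], 3: ["c"]}
    forward = simulate_target_cycle(state, ring_route(4), [0, 1, 2, 3])
    backward = simulate_target_cycle(state, ring_route(4), [3, 2, 1, 0])
    print(forward == backward)
```

In this ring example each router has a single upstream neighbor, so the per-inbox flit order is also independent of the visit order. A router with several upstream neighbors would additionally need a deterministic arbitration rule.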

Fast scalable FPGA-based Network-on-Chip simulation models

Papamichael

2011

Ninth ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE2011)

Abstract: This paper presents a set of two FPGA-based Network-on-Chip (NoC) simulation engines that composed the winning design of the 2011 MEMOCODE Design Contest in the absolute performance class. Both simulation engines were developed in Bluespec SystemVerilog (BSV) and were implemented on a Xilinx ML605 FPGA development board. For smaller networks and simpler router configurations, a direct-mapped approach was employed, where the network to be simulated was directly implemented on the FPGA. For larger networks, where a direct-mapped approach is not feasible due to FPGA resource limitations, a virtualized time-multiplexed approach was used. Compared to the provided software reference implementation, our direct-mapped approach achieves three orders of magnitude speedup, while our virtualized time-multiplexed approach achieves one to two orders of magnitude speedup, depending on the network and router configuration.

“…That is the case of RAMP Gold [17] and FAST [18]. In contrast, HaSim [19] and Arete [13] use A-Ports or Latency-Insensitive Bounded Data-flow Networks, which decouple FPGA cycles from model cycles while guaranteeing cycle-accuracy. HaSim uses time multiplexing, which does not scale.…”

Section: FPGA-based Multicores and Simulators (mentioning)

confidence: 99%

High-Level Debugging and Verification for FPGA-Based Multicore Architectures

Arcas-Abella

Cristal

Unsal

2015

2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines

Abstract: Simulators are key tools for computer architecture research. However, multicore architectures represent a highly complex challenge for software simulators, which may suffer from fidelity loss and long execution times. FPGAs can simulate multicore architectures with scalable performance and high accuracy, but the difficulty of debugging could hinder their adoption. In this paper we propose several techniques for inspection, debugging and verification of multicore architectures, both for software-based and FPGA-based simulations. These debugging extensions are cycle-accurate and unobtrusive. As a proof of concept, we have developed a 24-core RISC multiprocessor that runs the Linux kernel, for which we provide three simulation modes: a fast, functional simulation; a detailed, cycle-accurate simulation; and an FPGA-based simulation. Our platform can run up to 24 cores and perform full-system verification at 17 million instructions per second.

“…4 The packet receiver will then wait until it has enough space to hold the bytes of the packet and then respond with the assertion of the read_enb_X (read_enb_0, read_enb_1 or read_enb_2) signal that is an input to the router. 5 The read_enb_X (read_enb_0, read_enb_1 or read_enb_2) input signal can be asserted on the rising/falling clock edge in which data are read from the data_out_X (data_out_0, data_out_1 or data_out_2) bus. 6 As long as the read_enb_X (read_enb_0, read_enb_1 or read_enb_2) signal remains active, the data_out_X (data_out_0, data_out_1 or data_out_2) bus drives a valid packet byte on each rising clock edge.…”

Section: Router Output Protocol (mentioning)

confidence: 99%
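
Steps 4 to 6 of the output protocol quoted above can be approximated by a small behavioral model: the port holds the bytes of a pending packet and drives one byte per rising clock edge for as long as the receiver keeps its read enable asserted. This is a simplified sketch, not the cited router design; the class name `OutputPort` and the method `rising_edge` are hypothetical, and returning `None` is an assumed way of modeling "no valid byte this cycle".

```python
from collections import deque


class OutputPort:
    """Behavioral model of one router output channel (data_out_X / read_enb_X)."""

    def __init__(self, packet):
        # Bytes of the packet waiting to be read by the receiver.
        self._pending = deque(packet)

    def rising_edge(self, read_enb):
        """Model one rising clock edge.

        Returns the byte driven on data_out while read_enb is asserted and
        bytes remain; returns None otherwise (assumed idle value).
        """
        if read_enb and self._pending:
            return self._pending.popleft()
        return None


if __name__ == "__main__":
    port = OutputPort(bytes([1, 2, 3]))
    # The receiver deasserts read_enb for one cycle in the middle of the packet.
    for enable in [False, True, True, False, True, True]:
        print(enable, port.rising_edge(enable))
```

The model ignores the falling-edge option mentioned in step 5 and treats all timing as aligned to the rising edge.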

An Area Optimized Robust Router Design Implementation

Rao

2013

IOSR-JECE
