2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines 2015
DOI: 10.1109/fccm.2015.37

Zedwulf: Power-Performance Tradeoffs of a 32-Node Zynq SoC Cluster

Abstract: Commodity SoCs with hybrid architectures that combine CPUs with programmable FPGA fabric, such as the Xilinx Zynq SoC, have become a competitive energy-efficient platform for addressing irregular parallelism in graph problems. In this paper, we prototype a 32-node cluster composed of these Zynq SoC chips to accelerate communication-bound sparse graph-oriented applications such as neural network simulations. We develop specialized MPI routines for irregular accelerator-to-accelerator commu…

Citations: cited by 34 publications (14 citation statements)
References: 15 publications (14 reference statements)
“…Zedwulf is a 32-node multi-FPGA SoC. The inter-FPGA communication is implemented by connecting all the 32 FPGAs via an Ethernet switch (Moorthy and Kapre, 2015). Although it is a good practice at multi-FPGA SoC implementation, the Ethernet-based interconnection introduces much higher latency overhead than MGT interconnection.…”
Section: Zedwulf (mentioning)
confidence: 99%
“…The back-end network with MGT links has no router support. In Zedwulf, FPGAs communicate purely through commodity Ethernet (Moorthy and Kapre, 2015). Few details are publicly available about the Catapult I router (Putnam et al, 2014).…”
Section: Previous Work on Wormhole VC-based Routers on FPGAs (mentioning)
confidence: 99%
“…Hence, there is an inherent need to reduce the time spent in fulfilling the network operations for maximizing performance gains while using distributed systems. We designed an optimized graph-oriented global scatter technique [8] using the Message Passing Interface (MPI) library. Step 1 and…”
Section: MPI Optimization (mentioning)
confidence: 99%
“…For example, ARM-based SoCs with Field-Programmable Gate Array (FPGA) logic have shown significant advantages in power consumption and efficiency in executing highly parallel computation [14]. Other examples of the efficient use of heterogeneous SoCs in server setups are the ZCluster [18] and Zedwulf [19], which improves performance over similar clusters based solely on either ARM CPUs or FPGA enabling elasticity to trade power with performance. Compared to commodity servers, these SoC solutions improve energy-efficiency out of the box.…”
Section: Introduction (mentioning)
confidence: 99%
“…based on Hadoop [18]. Zedwulf is another SoC cluster which allows exploration of the performance-power trade-off for sparse graph applications [19].…”
Section: Introduction (mentioning)
confidence: 99%