2013 23rd International Conference on Field Programmable Logic and Applications
DOI: 10.1109/fpl.2013.6645553

TILT: A multithreaded VLIW soft processor family

Abstract: We propose TILT, an FPGA-based compute engine designed to highly utilize multiple, varied, and deeply pipelined functional units by leveraging thread-level parallelism and static compiler analysis and scheduling. For this work we focus on deeply pipelined floating-point functional units of widely varying latency, executing Hodgkin-Huxley neuron simulation as an example a…

Fig. 1. The TILT Architecture, consisting of a scratchpad, banked, multi-ported memory system and FUs connected by crossbar networks.

Citations: cited by 10 publications (7 citation statements)
References: 8 publications (7 reference statements)
“…Most successful TM overlays are based on soft processors. The more performance oriented ones include, SIMD Octavo [13], VectorBlox MXP [24] and VLIW TILT [19]. A massively parallel overlay, called GRVI Phalanx [7], based on the RISC-V processor and the Hoplite NOC [11] mapped 1680 RISC-V cores onto an UltraScale+ VU9P.…”
Section: Related Work (mentioning)
confidence: 99%
“…Similarly for the 2nd cluster, scheduling as: 14,26,21,10,16,11,27,22, resolves dependencies 14-11, 26-27, and 21-22, for all overlay versions. In cluster three, scheduling as: 18,24,28,23,19,30,8, resolves all dependencies for the V4 and V5 overlays, but not for the V3 overlay, which with an IWP of 5 requires 4 operations between dependant nodes. Hence, a single NOP must be added between 23 and 19 which then resolves all 4 sets of dependant instructions.…”
Section: Compiling To the Overlay (mentioning)
confidence: 99%
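The NOP-padding idea in the statement above can be illustrated with a minimal sketch. It is not the cited authors' compiler: the function name `pad_schedule`, the `NOP` marker, and the `min_gap` parameter are hypothetical, and the only data taken from the quote is the second cluster's schedule and its three dependencies. The sketch walks a fixed schedule and inserts NOPs until every consumer has at least `min_gap` operations between it and its producer.

```python
NOP = "NOP"  # hypothetical marker for an inserted no-op slot


def pad_schedule(schedule, deps, min_gap):
    """Return a copy of `schedule` with NOPs inserted so that every
    (producer, consumer) pair in `deps` has at least `min_gap`
    operations issued between producer and consumer.

    The relative order of the original operations is preserved.
    """
    # Map each consumer to the producers it depends on.
    producers = {}
    for prod, cons in deps:
        producers.setdefault(cons, []).append(prod)

    out = []   # padded schedule being built
    slot = {}  # operation -> issue slot in `out`
    for op in schedule:
        for prod in producers.get(op, []):
            if prod not in slot:
                raise ValueError(f"{prod} must be scheduled before {op}")
        # Earliest legal slot: producer slot + min_gap intervening ops + 1.
        earliest = max(
            (slot[p] + min_gap + 1 for p in producers.get(op, [])),
            default=0,
        )
        while len(out) < earliest:
            out.append(NOP)
        slot[op] = len(out)
        out.append(op)
    return out


# Second cluster from the quoted statement: schedule and dependencies.
cluster2 = [14, 26, 21, 10, 16, 11, 27, 22]
deps2 = [(14, 11), (26, 27), (21, 22)]

# With 4 operations required between dependent nodes, no padding is needed.
print(pad_schedule(cluster2, deps2, min_gap=4))

# A stricter gap of 5 (a hypothetical value, not from the source) forces a NOP.
print(pad_schedule(cluster2, deps2, min_gap=5))
```

With a gap of 4 the schedule comes back unchanged, matching the quote's claim that this ordering resolves the dependencies for all overlay versions. The third cluster is not reproduced here because the quote does not list its dependency pairs.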
“…To improve power consumption and throughput, smaller and faster processor architectures, such as the iDEA processor [22], have been proposed. Examples of multi-threaded and parallel processors include: CUSTARD [23], Octavo [24] and SIMD-Octavo [25], The VectorBlox MXP soft vector processor [26] and the TILT VLIW processor [27].…”
Section: B Time-multiplexed Overlays (mentioning)
confidence: 99%
“…In the area of instruction programmable FPGA overlays, active academic research on vector processors [36,37] is going on in the area of embedded computing devices as throughput-optimized alternatives to scalar soft processors. Ovtcharov et al [38] add the concept of GPU-like multithreading to hide latencies of functional units and memory access by pipelining the execution of different threads. As proposed by Kingyens and Steffan [39] and brought forward by Convey with CHOMP [40] as successor to the vector processor utilized in this work, such a GPU-like architecture may be a promising architecture template for acceleration of server- and datacenter-scale computing tasks.…”
Section: Related Work (mentioning)
confidence: 99%