Abstract: This paper introduces adaptive techniques targeted at heterogeneous manycore architectures and presents the FlexTiles platform, which consists of general-purpose processors with some dedicated accelerators. The different components are based on low-power DSP cores and an eFPGA on which dedicated IPs can be dynamically configured at run-time. These features enable a breakthrough in terms of computing performance while improving the on-line adaptive capabilities brought by smart heuristics. Thus, we…
“…The approach presented in this paper, can efficiently be integrated into the Aethereal NoC (see [11]), which has a high importance in current research projects such as FlexTiles (see [12]). The Aethereal NoC is used in the FlexTiles project, to establish the intertile communication and for the data transfer from and to the on-chip and external memory blocks.…”
This paper presents the hardware architecture and the software abstraction layer of an adaptive multiclient Network-on-Chip (NoC) memory core. The memory core supports the flexibility of a heterogeneous FPGA-based runtime adaptive multiprocessor system called RAMPSoC. The processing elements, also called clients, can access the memory core via the NoC. The memory core supports a dynamic mapping of an address space for the different clients as well as different data transfer modes, such as variable burst sizes. In this way, two main limitations of FPGA-based multiprocessor systems are mitigated: the restricted on-chip memory resources and the fact that usually only one physical channel to off-chip memory exists. Furthermore, a software abstraction layer is introduced, which hides the complexity of the memory core architecture and provides an easy-to-use interface for the application programmer. Finally, the advantages of the novel memory core in terms of performance, flexibility, and user friendliness are shown using a real-world image processing application.
“…Multicore architectures provide significant Space, Weight and Power savings (SWaP) while offering massive computing capabilities compared with single core processors. They are also capable of integrating diverse applications on the same platform [2], [3].…”
Abstract: Multicore architectures have great potential for energy-constrained embedded systems, such as energy-harvesting wireless sensor networks. Some embedded applications, especially the real-time ones, can be modeled as imprecise computation tasks. A task is divided into a mandatory subtask that provides a baseline Quality-of-Service (QoS) and an optional subtask that refines the result to increase the QoS. Combining dynamic voltage and frequency scaling, task allocation and task adjustment, we can maximize the system QoS under real-time and energy supply constraints. However, the nonlinear and combinatorial nature of this problem makes it difficult to solve. This work first formulates a mixed-integer non-linear programming problem to concurrently carry out task-to-processor allocation, frequency-to-task assignment and optional task adjustment. We provide a mixed-integer linear programming form of this formulation without performance degradation and we propose a novel decomposition algorithm to provide an optimal solution with reduced computation time compared to state-of-the-art optimal approaches (22.6% on average). We also propose a heuristic version that has negligible computation time.
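The imprecise computation model described in this abstract can be illustrated with a toy sketch. This is not the paper's MILP formulation or its decomposition algorithm. It only brute-forces a single task over a handful of frequency levels. The `Task` fields, the `best_setting` function, the constant-power-per-level energy model and all numbers are assumptions made up for illustration. Units are chosen so the arithmetic stays simple: MHz (cycles per microsecond), microseconds, mW and nJ.

```python
from dataclasses import dataclass


@dataclass
class Task:
    mandatory_cycles: int   # must always complete (baseline QoS)
    optional_cycles: int    # upper bound on refinement work (extra QoS)
    deadline_us: int        # relative deadline in microseconds


def best_setting(task, levels, energy_budget_nj):
    """Pick the frequency level that allows the most optional cycles.

    levels: list of (frequency_mhz, power_mw) pairs (hypothetical DVFS table).
    Assumed energy model: energy = power * execution_time.
    Returns (frequency_mhz, optional_cycles_done), or None if the
    mandatory part cannot be completed at any level.
    """
    best = None
    for freq_mhz, power_mw in levels:
        # Cycles that fit before the deadline at this frequency.
        by_time = freq_mhz * task.deadline_us
        # Cycles that the energy budget pays for at this frequency.
        by_energy = freq_mhz * energy_budget_nj / power_mw
        cycles = min(by_time, by_energy)
        if cycles < task.mandatory_cycles:
            continue  # mandatory subtask infeasible at this level
        optional = min(task.optional_cycles, cycles - task.mandatory_cycles)
        if best is None or optional > best[1]:
            best = (freq_mhz, optional)
    return best


if __name__ == "__main__":
    task = Task(mandatory_cycles=500_000, optional_cycles=1_000_000,
                deadline_us=10_000)
    levels = [(50, 40), (100, 100), (200, 400)]
    print(best_setting(task, levels, energy_budget_nj=1_200_000))
```

In this made-up example the lowest level is limited by the deadline and the highest level by the energy budget, so the middle level yields the most optional work. A real formulation would also have to handle several tasks and processors at once, which is where the combinatorial difficulty mentioned in the abstract comes from.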
“…As the number of IPs (Intellectual Property) increases rapidly inside a chip, interconnecting them becomes increasingly challenging [1]. Network-on-Chip (NoC) is the most efficient architecture to build a many-core interconnect system.…”
Abstract: Multi-FPGA platforms are very popular today for pre-silicon verification of complex designs due to their low cost and high speed. The idea is to divide these systems into smaller sub-systems and implement each one on a separate chip. The challenge is that the number of IOs available on an FPGA remains constant despite the technological evolution. This problem is resolved by multiplexing several cut-signals using the time division multiplexing scheduling mechanism. This structure, however, has a strong effect on the speed of transmission between FPGAs, and an inter-FPGA bottleneck appears. In this paper, we focus on evaluating the Network-on-Chip on multi-FPGA using the high-speed serial transceiver GTX block. In order to speed up the transmission between FPGAs, the GTX transceiver is used to provide a high bandwidth while using fewer pins compared to existing approaches based on ordinary FPGA IO pins. Depending on the available multi-gigabit transceiver, the bandwidth per connection is between 3.125 and 28 Gb/s, which allows for large amounts of data to be moved quickly between multiple FPGAs. In our evaluation, a VC707 platform based on the Virtex-7 device is used. The simulation results show that the proposed architecture provides low area consumption and latencies under different traffic patterns.
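The pin-count and bandwidth argument in this abstract comes down to simple arithmetic, sketched below. The function names and the example figures (1000 cut signals, 120 pins, a 1 Mb payload) are hypothetical and not taken from the paper. Only the 3.125 and 28 Gb/s line rates come from the abstract. The transfer time is a raw lower bound that ignores line coding and protocol overhead.

```python
def tdm_ratio(cut_signals, io_pins):
    """Number of cut signals that must share each physical pin
    under time division multiplexing (ceiling division)."""
    if io_pins <= 0:
        raise ValueError("io_pins must be positive")
    return (cut_signals + io_pins - 1) // io_pins


def transfer_time_us(payload_bits, line_rate_gbps):
    """Raw time in microseconds to move payload_bits over one serial
    link, ignoring encoding and protocol overhead (assumption)."""
    if line_rate_gbps <= 0:
        raise ValueError("line_rate_gbps must be positive")
    bits_per_us = line_rate_gbps * 1e3  # 1 Gb/s = 1000 bits per microsecond
    return payload_bits / bits_per_us


if __name__ == "__main__":
    # Hypothetical partition: 1000 cut signals over 120 ordinary IO pins.
    print("TDM ratio:", tdm_ratio(1000, 120))
    # Range of line rates quoted in the abstract.
    for rate in (3.125, 28.0):
        print(rate, "Gb/s:", transfer_time_us(1_000_000, rate), "us")
```

A higher TDM ratio means each pin must be time-shared by more signals, which lowers the achievable emulation clock. That is the bottleneck the serial transceivers are meant to relieve.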