Abstract: Heterogeneous system-on-chip (SoC) architectures are emerging as a fundamental computing platform across a variety of domains, from mobile to cloud computing. Heterogeneity, however, increases design complexity in terms of hardware-software interactions, access to shared resources, and diminished regularity of the design. Embedded Scalable Platforms are a novel approach to SoC design and programming that addresses these design-complexity challenges by combining an architecture and a methodology. The flexible s…
“…Embedded Scalable Platforms (ESP) is a new approach to SoC design and programming that addresses these challenges by combining a flexible tile-based socketed architecture with a companion system-level design (SLD) methodology [2]. Each tile of an ESP instance can host a processor, I/O peripherals, system utilities, or accelerators, which are typically configurable but not programmable. The selection of the mix of tiles for a target application domain is the result of a design-space exploration guided by the SLD methodology.…”
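The design-space exploration mentioned above can be illustrated with a toy model. All tile types, area costs, and speedup figures below are hypothetical placeholders, not ESP data: the sketch simply enumerates tile mixes under an area budget and keeps the best-performing feasible one.

```python
from itertools import product

# Hypothetical per-tile area costs and speedup contributions
# (illustrative numbers only, not taken from ESP).
TILES = {
    "cpu":  {"area": 4, "speedup": 1.0},
    "fft":  {"area": 2, "speedup": 3.0},
    "conv": {"area": 3, "speedup": 5.0},
}

def explore(area_budget, max_per_type=2):
    """Exhaustively enumerate tile counts; return the best feasible mix."""
    names = list(TILES)
    best_mix, best_perf = None, 0.0
    for mix in product(range(max_per_type + 1), repeat=len(names)):
        area = sum(n * TILES[t]["area"] for n, t in zip(mix, names))
        # Must fit the area budget and include at least one processor tile.
        if area > area_budget or mix[names.index("cpu")] == 0:
            continue
        perf = sum(n * TILES[t]["speedup"] for n, t in zip(mix, names))
        if perf > best_perf:
            best_mix, best_perf = dict(zip(names, mix)), perf
    return best_mix, best_perf

mix, perf = explore(12)
```

A real exploration would of course use cycle-accurate or FPGA-based evaluation rather than additive speedup scores, but the structure (enumerate, prune infeasible points, rank) is the same.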
The complexity of System-on-Chip (SoC) designs continues to grow as each SoC features an increasing variety of loosely coupled accelerators together with multiple processor cores. Specialized-hardware accelerators are typically designed in isolation, optimized for the algorithm they are implementing, and with limited consideration of the implications of their integration into a given SoC. However, the interaction between these accelerators and the memory hierarchy is critically important for their performance and the performance of the overall SoC. By leveraging our platform for rapid SoC prototyping, we analyze three models of coherence for loosely coupled accelerators from a system-level perspective.

Having originated in the world of embedded systems, the SoC has emerged as the main computation engine across a variety of computing-system classes. Examples of major SoC product families include the Apple A Series, the NVIDIA Tegra, the Qualcomm Snapdragon, and the recently announced Xilinx Everest. A state-of-the-art SoC combines many general-purpose processor cores with a growing number of accelerators, each offering a high-performance specialized-hardware implementation of an algorithm (or a small class of algorithms). These accelerators are loosely coupled because they are located outside the cores and execute coarse-grain tasks on large datasets independently from them [1]. Considering also that it is desirable to reuse a loosely coupled accelerator across different SoCs, it is not surprising that its design is typically performed and evaluated in isolation. However, the system-level integration of an accelerator and its run-time interaction with the other SoC components has a critical influence on the performance it can deliver.
For example, while an accelerator is often designed with the ideal assumption of a perfect balance between its memory-bandwidth requirements and the bandwidth it can access in a system, the SoC reality involves memory contention and interconnect congestion.
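This bandwidth gap can be quantified with simple arithmetic. The model below is an illustrative sketch (all numbers are hypothetical): memory-bound time scales with the accelerator's fair share of DRAM bandwidth, while compute time is unaffected.

```python
def contended_exec_time(compute_s, bytes_moved, peak_bw_gbps, n_contenders):
    """Illustrative model: the accelerator gets a fair 1/n share of
    peak DRAM bandwidth; transfers and compute do not overlap
    (a pessimistic simplification)."""
    share_bps = peak_bw_gbps * 1e9 / n_contenders  # bytes/s actually available
    mem_s = bytes_moved / share_bps                # time spent on transfers
    return compute_s + mem_s

# Designed in isolation: 10 ms of compute, 0.8 GB moved at 16 GB/s.
isolated = contended_exec_time(0.010, 0.8e9, 16, 1)   # ~60 ms total
# Integrated next to three other bandwidth-hungry agents.
shared = contended_exec_time(0.010, 0.8e9, 16, 4)     # ~210 ms total
```

Even this crude model shows why an accelerator evaluated in isolation can fall far short of its nominal performance once contention is accounted for; real systems add interconnect congestion and non-uniform arbitration on top.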
“…To guarantee the security of such systems with DIFT, we need to implement a holistic approach: DIFT must be supported in both processors and accelerators. This ensures that (1) the tags are propagated from the processor cores to the accelerators…¹ [Footnote 1: We assume that a hardware implementation of DIFT is available for the processor and the communication infrastructure. An equally valid alternative would be a hybrid approach where the accelerators are protected in hardware while the software applications are protected by a software-based DIFT approach within the operating system (see Section IX for related work).]”
Section: Need of a Holistic DIFT Implementation
“…The execution time reported in these experiments corresponds to the time required by the accelerator to process the given workload in hardware. To measure the execution time for each combination of accelerator, burst size and tag offset, we leveraged the Embedded Scalable Platforms (ESP) methodology [1], [39] to design an SoC that includes a processor core (LEON3), a memory controller, and the specific accelerator. We ran these experiments on the FPGA by booting Linux on the processor core.…”
Section: Performance and Cost Analysis
“…Dynamic information flow tracking (DIFT), also known as dynamic taint analysis in the literature, has been proposed as a promising security technique to protect systems against software attacks [16], [17]. DIFT is based on the observations that (1) it is impossible to prevent the injection of untrustworthy data in software applications (e.g., data coming from software users), and (2) it is very difficult to cover all the possible exploits that use such data. It is better to monitor, i.e., track, the suspicious data flows during the application execution to ensure that they are not exploited and do not cause a security violation.…”
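The tracking idea can be sketched in a few lines. This is a toy software model, not the hardware DIFT of the cited work: each value carries a one-bit taint tag, tags propagate through operations, and a check at a sensitive sink flags a violation.

```python
class Tagged:
    """A value paired with a one-bit taint tag (toy DIFT model)."""
    def __init__(self, value, tainted=False):
        self.value, self.tainted = value, tainted

    def __add__(self, other):
        # Propagation rule: the result is tainted if either operand is.
        return Tagged(self.value + other.value,
                      self.tainted or other.tainted)

def jump_to(addr):
    """Sensitive sink: an indirect jump must never use tainted data."""
    if addr.tainted:
        raise RuntimeError("DIFT violation: tainted jump target")
    return addr.value

user_input = Tagged(0x41, tainted=True)   # untrustworthy source
base = Tagged(0x1000)                     # trusted constant
```

Here `jump_to(base)` succeeds, while `jump_to(base + user_input)` raises, because the taint injected at the source survives the addition and is caught at the sink; this is exactly the monitor-rather-than-prevent philosophy the excerpt describes.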
Software-based attacks exploit bugs or vulnerabilities to get unauthorized access or leak confidential information. Dynamic information flow tracking (DIFT) is a security technique to track spurious information flows and provide strong security guarantees against such attacks. To secure heterogeneous systems, the spurious information flows must be tracked through all their components, including processors, accelerators (i.e., application-specific hardware components) and memories. We present PAGURUS, a flexible methodology to design a low-overhead shell circuit that adds DIFT support to accelerators. The shell uses a coarse-grain DIFT approach, thus requiring no modifications to the accelerator's implementation. We analyze the performance and area overhead of the DIFT shell on FPGAs and we propose a metric, called information leakage, to measure its security guarantees. We perform a design-space exploration to show that we can synthesize accelerators with different characteristics in terms of performance, cost and security guarantees. We also present a case study where we use the DIFT shell to secure an accelerator running on an embedded platform with a DIFT-enhanced RISC-V core.
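The coarse-grain policy of such a shell can be sketched behaviorally. This is a hypothetical software model with made-up names, not the PAGURUS RTL: the shell ORs the tags of all inputs and attaches that single tag to every output, so the wrapped accelerator itself needs no changes.

```python
def dift_shell(accelerator, inputs, input_tags):
    """Wrap an unmodified accelerator with coarse-grain DIFT:
    the single output tag is the OR of all input tags (conservative,
    so it may over-taint but never under-taints)."""
    out_tag = any(input_tags)                 # coarse-grain propagation
    outputs = accelerator(inputs)             # black-box computation
    return outputs, [out_tag] * len(outputs)  # same tag on every output

# Hypothetical accelerator kernel: element-wise doubling.
double = lambda xs: [2 * x for x in xs]
outs, tags = dift_shell(double, [1, 2, 3], [False, True, False])
```

Because one tainted input taints every output, the scheme trades precision for simplicity; that trade-off is what the information-leakage metric in the abstract is meant to quantify.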
“…According to ITRS predictions, future SoCs will be characterized by heavy reuse (more than 90% by 2020) of Intellectual Property (IP) blocks for reducing design cost and time-to-market [2]. To increase productivity and tackle design complexity, system designers will increasingly use High-Level Synthesis (HLS) to automatically generate specialized IP blocks in a suitable Hardware Description Language (HDL) [3], while integrating all the components with Electronic System Level (ESL) methodologies [4].…”