Emerging embedded system platforms in the realm of high-throughput multimedia, imaging, and signal processing will consist of multiple microprocessors and reconfigurable components. One of the major problems is how to program these platforms in a systematic and automated way so as to satisfy the performance needs of the applications executed on them. In this paper, we present our system design approach as an efficient solution to this programming problem. We show how, for an application written in Matlab, a Kahn Process Network specification can be derived automatically and mapped systematically onto a target platform composed of a microprocessor and an FPGA. Furthermore, we illustrate how the mapping approach is applied to a real-life example, namely an M-JPEG encoder.
In this paper we present an approach for quantitative analysis of application-specific dataflow architectures. The approach allows the designer to rate design alternatives in a quantitative
1: Introduction
In the application domain of real-time video, the required processing power is on the order of hundreds of RISC-like operations per pixel, while the data rate of pixel streams is in the range of 10 to 100 Msamples per second. Consequently, architectures are needed that perform billions of operations per second and have an internal communication bandwidth of Gbytes per second. In the application domain of real-time video we focus on dedicated architectures that support the concept of streams [17] and achieve the required performance by exploiting the inherent parallelism of the applications on domain-specific, coarse-grain processors with limited internal flexibility (i.e. weakly programmable). An example of such a domain-specific architecture is given in figure 1. The architecture consists of different dedicated application-specific coarse-grain processors that operate independently of each other on data streams. These streams are exchanged between the coarse-grain processors via a communication network and are controlled by some global controller. These kinds of architectures are typically embedded in a larger system that also contains memory and a general-purpose processor, e.g. a RISC processor. In the design of these architectures, many choices have to be made. In this paper we present a simulation environment that aids the designer in making these choices based on quantitative information. In section 2 we present our problem statement. A solution approach is given in section 3. In section 4 we review related work on quantitative evaluation of design alternatives. The solution approach is further detailed for application-specific dataflow architectures in the following sections. In
Abstract: New heterogeneous multiprocessor platforms are emerging that are typically composed of loosely coupled components that exchange data using programmable interconnections. The components can be CPUs or DSPs, specialized IP cores, reconfigurable units, or memories. To program such a platform, we use the Process Network (PN) model of computation. Localized control and distributed memory are the two key ingredients of a PN that allow us to program these platforms. The localized control matches the loosely coupled components, and the distributed memory matches the style of interaction between the components. To obtain applications in a PN format, we have built the Compaan compiler, which translates affine nested-loop programs into functionally equivalent PNs. In this paper, we describe a novel analytical translation procedure, based on integer linear programming, that we use in our compiler. The translation procedure consists of four main steps, and we present each step by describing the main idea involved, followed by a representative example.
The Compaan compiler framework automates the transformation of DSP applications written in Matlab into Kahn Process Networks (KPNs).
The Compaan framework [8] automatically transforms digital signal processing applications, written in a subset of Matlab, into Kahn Process Networks. These KPNs express the signal processing applications in a parallel, distributed way, making them more suitable for mapping onto parallel architectures. These networks can be converted to VHDL and quickly synthesized to FPGAs [5], or mapped onto parallel signal processing architectures [9] at a high level of abstraction to obtain first-order performance numbers. The simplest instance of a Kahn Process Network is a Producer process that communicates with a Consumer process over an unbounded FIFO channel, from which the Consumer reads data using a blocking read. A problem emerges if the order in which data is produced differs from the order in which it is consumed. Such a situation leads to functionally incorrect evaluation of the network and, moreover, may lead to deadlock. To avoid this, it is necessary to introduce a reordering mechanism that allows the Consumer to consume the data in the correct order. This paper presents a compile-time approach to determine, for a derived process network, whether FIFOs are sufficient or additional reordering mechanisms are needed. For the case in which a reordering mechanism is required, we also provide its model. We assume that the process networks are derived using the Compaan toolset from nested-loop programs written in Matlab. The toolset consists of three tools. The first tool transforms the initial
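The Producer/Consumer ordering problem described above can be illustrated with a small sketch. This is not the reordering model from the paper and not Compaan output. It assumes a hypothetical Producer that writes tokens in row-major order over a 2x3 iteration space and a Consumer that needs them in column-major order. The names `producer`, `fifo_consumer`, `reordering_consumer`, and `run`, and the bounds `N` and `M`, are all illustrative. The FIFO is modelled with Python's `queue.Queue`, whose `get()` blocks until a token is available.

```python
import queue
import threading

N, M = 2, 3  # hypothetical loop bounds, chosen only for illustration


def producer(fifo):
    # Writes tokens in row-major order: (0,0), (0,1), (0,2), (1,0), ...
    # Each token carries its iteration point so the consumer can identify it.
    for i in range(N):
        for j in range(M):
            fifo.put(((i, j), 10 * i + j))


def fifo_consumer(fifo):
    # Plain blocking reads: values are seen in production order,
    # whatever order the consumer actually needs.
    return [fifo.get()[1] for _ in range(N * M)]


def reordering_consumer(fifo):
    # Consumes in column-major order. Tokens that arrive too early are
    # parked in a local buffer until their iteration point is reached.
    pending = {}
    out = []
    for j in range(M):
        for i in range(N):
            while (i, j) not in pending:
                key, value = fifo.get()  # blocking read on the FIFO
                pending[key] = value
            out.append(pending.pop((i, j)))
    return out


def run(consumer):
    fifo = queue.Queue()  # no maxsize, so the channel is unbounded
    worker = threading.Thread(target=producer, args=(fifo,))
    worker.start()
    result = consumer(fifo)
    worker.join()
    return result


if __name__ == "__main__":
    wanted = [10 * i + j for j in range(M) for i in range(N)]
    print("needed by consumer:", wanted)
    print("plain FIFO        :", run(fifo_consumer))
    print("with reordering   :", run(reordering_consumer))
```

In this sketch the plain FIFO hands the Consumer the values in the Producer's order, which differs from the order the Consumer's loop nest expects, while the buffered variant recovers the expected order at the cost of extra local memory. Tagging each token with its iteration point is a simplification made here for clarity; a compile-time analysis such as the one the text describes would aim to decide in advance whether such a buffer is needed at all.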