A coarse-grained reconfigurable architecture (CGRA) typically has an array of processing elements (PEs) controlled by a centralized unit. This makes it difficult to execute programs with control divergence among PEs without predication. However, conventional predication techniques have a negative impact on both performance and power consumption because of longer instruction words and unnecessary instruction fetching, decoding, and nullifying steps. This article reveals performance and power issues in predicated execution that have not yet been well addressed. Furthermore, it proposes fast and power-efficient predication mechanisms. Experiments conducted through gate-level simulation show that our mechanism improves energy-delay product by 11.9% to 23.8% on average.
The era of the Internet of Things (IoT) is upon us. In this era, minimizing power consumption becomes a primary concern of system-on-chip designers. Ultra-low power (ULP) VLSI circuits have been receiving considerable interest from both academia and industry as the technique best suited to IoT devices, because they can take full advantage of the power savings that voltage scaling can achieve. Consequently, research on ULP designs has begun to yield tangible outcomes, namely ULP circuits. However, little attention has been paid to the ULP Network-on-Chip (NoC), although the NoC is an essential component of ULP chips and its power consumption accounts for a significant portion of the total power. This paper focuses on ULP NoCs and presents a new power management method that exploits the delay vs. temperature characteristics of ULP circuits. Recent studies on ULP circuits show that their delay vs. temperature characteristics are fundamentally different from those of normal circuits: the delay of ULP circuits implemented in state-of-the-art bulk CMOS operating at low supply voltages, or in FinFET technologies, decreases with increasing temperature, a phenomenon known as temperature effect inversion (TEI). The starting intuition is that, at a certain temperature point, power can be saved without a performance penalty either by increasing the router frequency, which creates the opportunity to turn off some routers in the ULP NoC, or by decreasing the NoC supply voltage level. Building on this intuition, an optimization method is presented that maximizes the power savings with a minor performance penalty. To validate the proposed method, a concrete ULP NoC simulator (TEI-Noxim) has been developed. Experimental results demonstrate that the TEI-aware NoC achieves an average power reduction of 36.0% over 21 applications.
The coarse-grained reconfigurable array is a very attractive architecture from the viewpoint of performance and flexibility. However, because its performance improvement is achieved by exploiting parallelism, the architecture is typically poor at handling control flow, which is sequential in nature. There have been many attempts to overcome this problem by using predicated execution techniques; however, they either do not support all types of control flow or suffer from performance degradation in doing so. In addition, predicated execution schemes in general require a longer execution time because both the if- and else-paths are always executed. This paper proposes advanced predicated execution techniques that can handle and accelerate all types of control flow with only 2% hardware overhead. These techniques can also be easily extended to general SIMD machines. We implemented these techniques on a coarse-grained reconfigurable array architecture and verified their functionality and effectiveness by accelerating an H.264 deblocking filter, a kernel that is both data- and control-intensive. The results show that the proposed approach achieves up to 43% improvement in execution time compared to speculation, while sacrificing 76% in code size, and 24% improvement in execution time compared to the previous full predication approach, with a smaller code size.
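As a rough illustration of why predicated execution takes longer when both the if- and else-paths are always executed, the following minimal sketch models a lockstep array in which every lane steps through both paths and a per-lane predicate decides which result is committed. It is not the mechanism proposed in the abstract above; the function names, the toy branch body, and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
from typing import List


def run_branchy(values: List[int], threshold: int) -> List[int]:
    """Reference semantics: each element follows only its own branch."""
    return [v - threshold if v > threshold else v + 1 for v in values]


def run_predicated(values: List[int], threshold: int) -> List[int]:
    """Toy lockstep model (an assumption, not the paper's design).

    All lanes receive the same instruction stream, so both paths are
    issued to every lane; the predicate only masks which results commit.
    """
    # Step 1: every lane computes its own predicate.
    pred = [v > threshold for v in values]
    out = list(values)
    # Step 2: if-path issued to all lanes, committed only where pred is true.
    out = [v - threshold if p else o for v, p, o in zip(values, pred, out)]
    # Step 3: else-path issued to all lanes, committed only where pred is false.
    out = [o if p else v + 1 for v, p, o in zip(values, pred, out)]
    return out


if __name__ == "__main__":
    data = [1, 5, 3]
    print(run_branchy(data, 2))     # [2, 3, 1]
    print(run_predicated(data, 2))  # [2, 3, 1]
```

Both functions return the same results, but the predicated version always spends steps on both paths regardless of how the predicates fall, which is the execution-time cost the abstract refers to.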