Power-efficient design of real-time systems based on programmable processors becomes more important as system functionality is increasingly realized through software. This paper presents a power-efficient version of a widely used fixed-priority scheduling method. The method yields a power reduction by exploiting slack times, both those inherent in the system schedule and those arising from variations in execution times. The proposed run-time mechanism is simple enough to be implemented in most kernels. Experimental results show that the proposed scheduling method obtains a significant power reduction across several kinds of applications.
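The slack-exploitation idea in this abstract can be illustrated with a minimal sketch. It is not the paper's run-time mechanism: the function names, the `min_ratio` floor, and the quadratic energy model (supply voltage assumed to scale linearly with clock frequency) are all assumptions made for illustration. The sketch stretches the only runnable task so that it finishes just before the next task release.

```python
def slowdown_ratio(remaining_wcet, now, next_release, min_ratio=0.1):
    """Return the fraction of full clock speed to run at, in [min_ratio, 1.0].

    remaining_wcet: worst-case execution time still needed at full speed.
    now, next_release: current time and the next task release time.
    min_ratio: hypothetical lowest speed the processor supports.
    """
    if remaining_wcet <= 0:
        raise ValueError("remaining_wcet must be positive")
    window = next_release - now
    # No slack available: the task must run at full speed.
    if window <= 0 or remaining_wcet >= window:
        return 1.0
    # Stretch the remaining work over the whole window, within hardware limits.
    return max(min_ratio, remaining_wcet / window)


def relative_dynamic_energy(ratio):
    """Energy per unit of work relative to full speed.

    Assumes an idealized model where voltage scales with frequency,
    so energy per cycle is proportional to ratio squared.
    """
    return ratio ** 2
```

For example, a task with 2 time units of remaining worst-case work and 8 time units until the next release could run at a quarter of full speed under these assumptions. A task that finishes earlier than its worst case leaves additional slack that a later call can reclaim.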
We report the growth of Ba₁₋ₓLaₓSnO₃ (x = 0.00, 0.005, 0.01, 0.02, and 0.04) thin films on the insulating BaSnO₃(001) substrate by pulsed laser deposition. The insulating BaSnO₃ substrates were grown by the Cu₂O-CuO flux, in which the molar fraction of KClO₄ was systematically increased to reduce electron carriers and thus induce a doping-induced metal-insulator transition, exhibiting a resistivity increase from ∼10⁻³ to ∼10¹² Ω cm at room temperature. We find that all the Ba₁₋ₓLaₓSnO₃ films are epitaxial, showing good in-plane lattice matching with the substrate as confirmed by X-ray reciprocal space mappings and transmission electron microscopy studies. The Ba₁₋ₓLaₓSnO₃ (x = 0.005–0.04) films showed degenerate semiconducting behavior, and the electron mobility at room temperature reached 100 and 85 cm² V⁻¹ s⁻¹ at doping levels of 1.3 × 10²⁰ and 6.8 × 10¹⁹ cm⁻³, respectively. This work demonstrates that thin perovskite stannate films of high quality can be grown on BaSnO₃(001) substrates for potential applications in transparent electronic devices.
8-Cl-cAMP (8-chloro-cyclic AMP), which induces differentiation, growth inhibition and apoptosis in various cancer cells, has been investigated as a putative anti-cancer drug. Although we reported that 8-Cl-cAMP induces growth inhibition via p38 mitogen-activated protein kinase (MAPK) and that a metabolite of 8-Cl-cAMP, 8-Cl-adenosine, mediates this process, the mechanism of action of 8-Cl-cAMP is still uncertain. In this study, it was found that 8-Cl-cAMP-induced growth inhibition is mediated by AMP-activated protein kinase (AMPK). 8-Cl-cAMP was shown to activate AMPK, and this activation was also dependent on the metabolic degradation of 8-Cl-cAMP. A potent agonist of AMPK, 5-aminoimidazole-4-carboxamide ribonucleoside (AICAR), could also induce growth inhibition and apoptosis. To further delineate the role of AMPK in 8-Cl-cAMP-induced growth inhibition and apoptosis, we used two approaches: pharmacological inhibition of the enzyme with compound C and expression of a dominant negative mutant (a kinase-dead form of AMPKα2, KD-AMPK). AICAR was able to activate p38 MAPK, and pre-treatment with the AMPK inhibitor or expression of KD-AMPK blocked this p38 MAPK activation. Cell growth inhibition was also attenuated. Furthermore, a p38 MAPK inhibitor attenuated 8-Cl-cAMP- or AICAR-induced growth inhibition but had no effect on AMPK activation. These results demonstrate that 8-Cl-cAMP induces growth inhibition through AMPK activation and that p38 MAPK acts downstream of AMPK in this signaling pathway.
This paper deals with the power minimization problem for data-dominated applications based on a novel concept called partially guarded computation. We divide a functional unit into two parts, MSP (Most Significant Part) and LSP (Least Significant Part), and allow the functional unit to perform only the LSP computation if the range of output data can be covered by the LSP. We dynamically disable the MSP computation to remove unnecessary transitions, thereby reducing power consumption. We also propose a systematic approach for determining the optimal location of the boundary between the two parts during high-level synthesis. Experimental results show about 10% to 44% power reduction with about 30% to 36% area overhead and less than 3% delay overhead in functional units.
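The MSP/LSP split described in this abstract can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the paper's hardware: it handles unsigned operands only, uses a deliberately conservative guard (both operands below half the LSP range, so the sum cannot carry into the MSP), and the names `guarded_add`, `msp_activity`, `width`, and `lsp_bits` are hypothetical.

```python
def guarded_add(a, b, width=16, lsp_bits=8):
    """Add two unsigned values, reporting whether the MSP had to be active.

    Returns (result, msp_active). When both operands are small enough that
    the sum fits in the LSP, only the low lsp_bits are computed.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    # Conservative guard: if both operands are below 2**(lsp_bits - 1),
    # the sum is below 2**lsp_bits and never carries into the MSP.
    half_lsp_range = 1 << (lsp_bits - 1)
    if a < half_lsp_range and b < half_lsp_range:
        lsp_mask = (1 << lsp_bits) - 1
        return (a + b) & lsp_mask, False
    # Otherwise the full-width unit is used (wrapping at width bits).
    return (a + b) & mask, True


def msp_activity(operand_pairs, width=16, lsp_bits=8):
    """Fraction of additions in operand_pairs that need the MSP."""
    active = sum(guarded_add(a, b, width, lsp_bits)[1] for a, b in operand_pairs)
    return active / len(operand_pairs)
```

Running `msp_activity` over a trace of operand pairs for several values of `lsp_bits` gives a rough feel for the boundary trade-off the abstract mentions: a wider LSP covers more operations but saves fewer transitions each time the MSP is disabled.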
Processing-in-memory (PIM) is rapidly rising as a viable solution for the memory wall crisis, rebounding from unsuccessful attempts in the 1990s that were held back by practicality concerns, which recent advances in 3D stacking technologies have alleviated. However, it is still challenging to integrate PIM architectures with existing systems in a seamless manner due to two common characteristics: unconventional programming models for in-memory computation units and the lack of ability to utilize large on-chip caches. In this paper, we propose a new PIM architecture that (1) does not change the existing sequential programming models and (2) automatically decides whether to execute PIM operations in memory or processors depending on the locality of data. The key idea is to implement simple in-memory computation using compute-capable memory commands and use specialized instructions, which we call PIM-enabled instructions, to invoke in-memory computation. This allows PIM operations to be interoperable with existing programming models, cache coherence protocols, and virtual memory mechanisms with no modification. In addition, we introduce a simple hardware structure that monitors the locality of data accessed by a PIM-enabled instruction at runtime to adaptively execute the instruction at the host processor (instead of in memory) when the instruction can benefit from large on-chip caches. Consequently, our architecture provides the illusion that PIM operations are executed as if they were host processor instructions. We provide a case study of how ten emerging data-intensive workloads can benefit from our new PIM abstraction and its hardware implementation. Evaluations show that our architecture significantly improves system performance and, more importantly, combines the best parts of conventional and PIM architectures by adapting to the data locality of applications.
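The locality-based choice between host and in-memory execution can be sketched in software. The model below is an assumption for illustration only: it approximates the hardware monitor with a small least-recently-used set of cache-block numbers, and the class name, `capacity`, and `block_bytes` are hypothetical rather than taken from the paper.

```python
from collections import OrderedDict


class LocalityMonitor:
    """Toy model: track recently touched blocks to guess cache locality."""

    def __init__(self, capacity=4, block_bytes=64):
        self.capacity = capacity
        self.block_bytes = block_bytes
        self._recent = OrderedDict()  # block number -> True, oldest first

    def choose(self, address):
        """Return "host" if the block was seen recently, else "memory"."""
        block = address // self.block_bytes
        if block in self._recent:
            # Recently used data is likely cached: run on the host processor.
            self._recent.move_to_end(block)
            return "host"
        # First touch (or evicted): run in memory and start tracking the block.
        self._recent[block] = True
        if len(self._recent) > self.capacity:
            self._recent.popitem(last=False)  # drop the least recently used
        return "memory"
```

A workload that repeatedly touches a small set of blocks is steered to the host after the first access to each block, while a streaming workload that never reuses a block keeps executing in memory.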
As process technology moves into the deep submicron range, interconnect delay dominates overall system delay, occupying most of the system clock cycle time. Interconnect delay is now a crucial factor that needs to be considered even during high-level synthesis. In this paper, we propose a concurrent scheduling and binding algorithm that takes interconnect delay into account. We first define our distributed target architecture, which minimizes the effect of interconnect delay on clock cycle time. We no longer assume that interconnect delay between functional units fits within one clock cycle; interconnect delay can span multiple clock cycles. We incorporate the concept of multi-cycle interconnect delay into the scheduling and binding process to reduce the critical path length and therefore the system latency. We show that by modeling multi-cycle interconnect delay, we can obtain latency improvements of up to 54%, and 37% on average.
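How multi-cycle interconnect delay enters a schedule can be shown with a simplified sketch. This is not the concurrent scheduling and binding algorithm the abstract proposes: it assumes the binding is already given, ignores resource conflicts on a functional unit, and uses plain as-soon-as-possible scheduling. All names (`asap_with_interconnect`, `wire_cycles`, and so on) are hypothetical.

```python
def asap_with_interconnect(ops, deps, binding, op_latency, wire_cycles):
    """Compute ASAP start cycles with per-edge transfer delays.

    ops: operation names in topological order.
    deps: dict mapping an operation to its list of predecessors.
    binding: dict mapping an operation to its functional unit.
    op_latency: dict mapping an operation to its execution cycles.
    wire_cycles: dict mapping (src_unit, dst_unit) to transfer cycles;
                 missing pairs are treated as zero extra cycles.
    """
    start = {}
    for op in ops:
        ready = 0
        for pred in deps.get(op, []):
            transfer = wire_cycles.get((binding[pred], binding[op]), 0)
            # A value is usable only after it is produced and transferred.
            ready = max(ready, start[pred] + op_latency[pred] + transfer)
        start[op] = ready
    return start


def schedule_latency(start, op_latency):
    """Total latency: the cycle at which the last operation finishes."""
    return max(start[op] + op_latency[op] for op in start)
```

Comparing `schedule_latency` for two candidate bindings of the same graph shows why binding and scheduling interact: moving a producer onto the consumer's unit removes the transfer cycles from the critical path.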