Power modeling and estimation has become one of the defining aspects of designing modern embedded systems. In this context, DDR SDRAM memories contribute significantly to system power consumption, yet they lack accurate and generic power models. The most popular SDRAM power model, provided by Micron, proves inaccurate or insufficient for several reasons. First, it does not consider the power consumed when transitioning to the power-down and self-refresh modes. Second, it employs the minimal timing constraints between commands from the SDRAM datasheets rather than the actual durations between the commands as issued by an SDRAM memory controller. Finally, without adaptations, it can only be applied to a memory controller that employs a close-page policy and accesses a single SDRAM bank at a time. These issues compromise the accuracy and validity of the power values the model reports, and resolving them forms the focus of our work. In this paper, we propose an improved SDRAM power model that estimates power consumption during the state transitions to power-saving states, employs an SDRAM command trace to obtain the actual timings between the issued commands, and is generic: it applies to all DDRx SDRAMs, all memory controller policies, and all degrees of bank interleaving. We quantitatively compare the proposed model against the unmodified Micron model on power and energy for DDR3-800, showing differences of up to 60% in energy savings for the precharge power-down mode at a power-down duration of 14 cycles, and up to 80% for the self-refresh mode at a self-refresh duration of 560 cycles.
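As a minimal sketch of the first correction, the snippet below folds the energy of power-down entry and exit transitions into an idle-period estimate. All currents, the supply voltage, and the transition lengths are hypothetical placeholders, not datasheet values or the paper's model:

```python
# Minimal sketch of adding state-transition energy to an SDRAM
# power-down estimate. All IDD currents, VDD, and timings are
# hypothetical placeholders, not datasheet values.

VDD = 1.5          # supply voltage (V), assumed
T_CK = 2.5e-9      # clock period (s) for DDR3-800
IDD3N = 0.045      # active standby current (A), assumed
IDD2P = 0.012      # precharge power-down current (A), assumed

def powerdown_energy(idle_cycles, entry_cycles=2, exit_cycles=3):
    """Energy (J) for an idle period spent in precharge power-down.

    A naive model charges the whole period at IDD2P; here the
    entry/exit transition cycles are charged at the higher standby
    current instead, as the improved model proposes.
    """
    transition = (entry_cycles + exit_cycles) * T_CK * VDD * IDD3N
    resident = max(idle_cycles - entry_cycles - exit_cycles, 0) \
        * T_CK * VDD * IDD2P
    return transition + resident

def naive_energy(idle_cycles):
    # Entire idle period billed at the power-down current.
    return idle_cycles * T_CK * VDD * IDD2P

# Ratio of the two estimates for idle periods of various lengths:
for n in (14, 100, 1000):
    print(n, powerdown_energy(n) / naive_energy(n))
```

For short idle periods, such as the 14-cycle case quoted above, the transition term dominates, which is where the two models diverge most.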
Systems on chip (SoC) contain multiple concurrent applications with different time criticality (firm, soft, non-real-time). These applications are often developed by different teams or companies, with different models of computation (MoC) such as dataflow, Kahn process networks (KPN), or time-triggered (TT). SoC functionality and (real-time) performance are verified after all applications have been integrated. In this paper we propose the CompSOC platform and design flows, which offer a virtual execution platform per application to allow independent design, verification, and execution. We introduce the composability and predictability concepts, explain why they help, and describe how they are implemented in the different resources of the CompSOC architecture. We define a design flow that allows real-time cyclo-static dataflow (CSDF) applications to be automatically mapped, verified, and executed. Mapping and analysis of KPN and TT applications is not automated, but they do run composably in their allocated virtual platforms. Although most of the techniques used here have been published in isolation, this paper is the first comprehensive overview of the CompSOC approach. Moreover, three new case studies illustrate all claimed benefits: 1) an example firm-real-time CSDF H.263 decoder is automatically mapped and verified; 2) applications with different models of computation (CSDF and TT) run composably; 3) adaptive soft-real-time applications execute composably and can hence be verified independently by simulation.
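Composability of this kind is commonly achieved with time-division multiplexing (TDM) of shared resources. The toy sketch below (hypothetical slot table, not the actual CompSOC arbiter) illustrates why one application's worst-case service is unaffected by the behavior of the others:

```python
# Toy TDM arbiter: each application owns fixed slots in a repeating
# table, so its worst-case service is independent of the others'
# behavior. The slot table and request counts are hypothetical.

SLOT_TABLE = ["A", "B", "A", "C"]  # repeating frame of 4 slots

def service_cycles(app, n_requests):
    """Worst-case cycles until n_requests slots of `app` have elapsed,
    assuming one request is served per owned slot."""
    served, cycle = 0, 0
    while served < n_requests:
        if SLOT_TABLE[cycle % len(SLOT_TABLE)] == app:
            served += 1
        cycle += 1
    return cycle

# The bound for "A" is identical whether "B" and "C" are idle or
# saturating the resource -- that is the composability argument.
print(service_cycles("A", 8))
```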
Reducing power/energy consumption is an important goal for all computer systems, from servers to battery-driven hand-held devices. To achieve this goal, the energy consumption of all system components needs to be reduced. One of the most power-hungry components is the off-chip DRAM, even when it is idle. DRAMs support different power-saving modes, such as self-refresh and power-down, but employing them every time the DRAM is idle reduces performance due to their power-up latencies. The self-refresh mode offers large power savings but incurs a long power-up latency; the power-down mode, on the other hand, has a shorter power-up latency but provides lower power savings. In this paper, we propose and evaluate a novel power-saving policy that combines the best of both power-saving modes in order to achieve significant power reductions with a marginal performance penalty. To accomplish this, we use a history-based predictor to forecast the duration of an idle period and then employ self-refresh, power-down, or a combination of both power-saving modes. Significant refinements are made to the predictor to maximize the energy savings and minimize the performance penalty. The presented policy is evaluated using several applications from the multimedia domain, and the experimental results show that it reduces the total DRAM energy consumption by between 68.8% and 79.9% at a negligible performance penalty of between 0.3% and 2.2%.
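A minimal sketch of such a policy is shown below. The exponentially weighted predictor, the break-even thresholds, and the escalation rule are illustrative assumptions, not the paper's exact refinements:

```python
# Sketch of a history-based idle-period predictor driving the choice
# between power-down and self-refresh. Thresholds and the averaging
# scheme are illustrative assumptions.

PD_BREAKEVEN = 14    # cycles beyond which power-down pays off (assumed)
SR_BREAKEVEN = 560   # cycles beyond which self-refresh pays off (assumed)

class IdlePredictor:
    def __init__(self, alpha=0.5):
        self.alpha = alpha      # weight of the newest sample
        self.predicted = 0.0    # exponentially weighted idle estimate

    def choose_mode(self):
        """Pick a mode for the idle period that is starting now."""
        if self.predicted >= SR_BREAKEVEN:
            return "self-refresh"
        if self.predicted >= PD_BREAKEVEN:
            return "power-down"
        return "active-standby"

    def on_idle_timeout(self, elapsed_cycles):
        # Combining both modes: if the idle period outlives the
        # prediction while in power-down, escalate to self-refresh.
        return "self-refresh" if elapsed_cycles >= SR_BREAKEVEN else None

    def observe(self, actual_idle_cycles):
        # Update the history with the idle period that just ended.
        self.predicted = (self.alpha * actual_idle_cycles
                          + (1 - self.alpha) * self.predicted)
```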
DRAM vendors provide pessimistic current measures in memory datasheets to account for the worst-case impact of process variations and to improve their production yield, leading to unrealistic power consumption estimates. In this paper, we first demonstrate the possible effects of process variations on DRAM performance and power consumption by performing Monte-Carlo simulations on a detailed DRAM cross-section. We then propose a methodology to empirically determine the actual impact for any given DRAM memory by assessing its performance characteristics during the DRAM calibration phase at system boot-time, thereby enabling its optimal use at run-time. We further apply our analysis to Micron's 2Gb DDR3-1600-x16 memory and show considerable over-estimation in the datasheet measures and the resulting energy estimates (up to 28%) by using realistic current measures for a set of MediaBench applications.
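To illustrate the gap between worst-case and typical dies, the sketch below runs a Monte-Carlo experiment over a single current measure. The nominal value, the Gaussian variation model, and the sigma are hypothetical stand-ins for the paper's detailed cross-section simulation:

```python
# Monte-Carlo sketch of process-variation impact on one DRAM current
# measure. Nominal value and variation model are assumptions, not
# results from the paper's cross-section simulation.

import random

NOMINAL_IDD4R = 0.180   # nominal read-burst current (A), assumed
SIGMA = 0.05            # 5% relative standard deviation, assumed

def sample_population(n=10000):
    # Each sample models one manufactured die's effective current.
    return [random.gauss(NOMINAL_IDD4R, NOMINAL_IDD4R * SIGMA)
            for _ in range(n)]

dies = sample_population()
worst = max(dies)                  # what a datasheet must cover
typical = sum(dies) / len(dies)    # what a given die likely draws
print(f"datasheet-style margin over typical: {worst / typical - 1:.1%}")
```

Measuring the actual die at boot-time calibration, as the paper proposes, lets the system bill power at the typical figure rather than the worst-case margin.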
JEDEC recently introduced its new standard for 3D-stacked Wide I/O DRAM memories, which defines their architecture, design, features, and timing behavior. With improved performance/power trade-offs over previous-generation DRAMs, Wide I/O DRAMs provide an extremely energy-efficient, green memory solution required for next-generation embedded and high-performance computing systems. With both industry and academia pushing to evaluate and employ these highly anticipated memories, there is an urgent need for an accurate power model targeting Wide I/O DRAMs that enables their efficient integration and energy management in DRAM-stacked SoC architectures. In this paper, we present the first system-level power model of 3D-stacked Wide I/O DRAM memories that is almost as accurate as detailed circuit-level power models of 3D DRAMs. To verify its accuracy, we experimentally compare its power and energy estimates for different memory workloads and operations against those of a circuit-level 3D-DRAM power model and show less than 2% difference between the two sets of estimates.
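A system-level power model of this kind typically multiplies per-operation energies by command counts and adds background power; the sketch below does so for a four-channel Wide I/O stack. All energy and power figures are hypothetical, not the model's calibrated values:

```python
# Sketch of a system-level energy estimate for a 4-channel Wide I/O
# DRAM stack: per-channel command counts times per-operation energy,
# plus background power. All figures are hypothetical.

E_ACT = 1.2e-9   # energy per ACT/PRE pair (J), assumed
E_RD  = 0.9e-9   # energy per read burst (J), assumed
E_WR  = 1.0e-9   # energy per write burst (J), assumed
P_BG  = 0.020    # background power per channel (W), assumed

def channel_energy(n_act, n_rd, n_wr, runtime_s):
    return n_act * E_ACT + n_rd * E_RD + n_wr * E_WR + P_BG * runtime_s

# Wide I/O exposes independent channels; total energy is the sum.
workload = [(1000, 4000, 2000)] * 4   # same trace on all 4 channels
total = sum(channel_energy(a, r, w, 1e-3) for a, r, w in workload)
print(f"{total * 1e6:.2f} uJ")
```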
In this paper, the performance of meta-heuristic algorithms such as GA (Genetic Algorithm), DE (Differential Evolution), PSO (Particle Swarm Optimization), and SA (Simulated Annealing) is compared for the problem of TTC (Total Transfer Capability) enhancement using FACTS devices. In addition, two novel techniques are proposed for the TTC assessment procedure. First, the optimization algorithm used for TTC enhancement is simultaneously used for the assessment of TTC. Second, the power flow is solved using the Broyden-Shamanski method with the Sherman-Morrison formula (BSS). The proposed approach is tested on the WSCC 9-bus and IEEE 118-bus test systems, and the results are compared with the conventional Repeated Power Flow (RPF) using the Newton-Raphson (NR) method, indicating that the proposed method provides better TTC enhancement and computational efficiency than the conventional procedure.
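The computational advantage of the BSS approach comes from updating the inverse-Jacobian approximation with a rank-1 Sherman-Morrison correction instead of refactorizing the Jacobian every iteration, as Newton-Raphson does. Below is a generic sketch of Broyden's method with this update, demonstrated on a toy two-variable system rather than a power-flow model:

```python
# Sketch of Broyden's method with the Sherman-Morrison update: the
# Jacobian is factorized once, then only rank-1 updates are applied.
# The test system is a toy 2-D example, not a power-flow model.

import numpy as np

def broyden_sherman_morrison(F, x0, J0, tol=1e-10, max_iter=50):
    """Solve F(x) = 0 starting from x0 with initial Jacobian J0."""
    x = np.asarray(x0, dtype=float)
    B_inv = np.linalg.inv(J0)      # single factorization/inversion
    f = F(x)
    for _ in range(max_iter):
        dx = -B_inv @ f            # quasi-Newton step
        x = x + dx
        f_new = F(x)
        if np.linalg.norm(f_new) < tol:
            return x
        df = f_new - f
        # Sherman-Morrison rank-1 update of the inverse Jacobian:
        u = dx - B_inv @ df
        vt = dx @ B_inv
        B_inv += np.outer(u, vt) / (vt @ df)
        f = f_new
    return x

# Toy nonlinear system with a root at (1, 1); Jacobian supplied once.
F = lambda v: np.array([v[0]**2 + v[1]**2 - 2.0, v[0] - v[1]])
J0 = np.array([[2.4, 1.8], [1.0, -1.0]])   # Jacobian at x0 = (1.2, 0.9)
print(broyden_sherman_morrison(F, [1.2, 0.9], J0))
```

Avoiding the repeated Jacobian build and factorization is what makes repeated power-flow runs, as required for TTC assessment, cheaper per solve.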
Real-time safety-critical systems should provide hard bounds on an application's performance. SDRAM controllers used in this domain should therefore have a bounded worst-case bandwidth, response time, and power consumption. Existing works on real-time SDRAM controllers only consider a narrow range of memory devices, and evaluate neither how their schedulers' performance varies across memory generations nor how the scheduling algorithm influences power usage. The extent to which the number of banks used in parallel to serve a request impacts performance is also unexplored, leaving gaps in the tool set of a memory subsystem designer in terms of both performance analysis and configuration options. This article introduces a generalized close-page memory command scheduling algorithm that uses a variable number of banks in parallel to serve a request. To reduce the schedule length for DDR4 memories, we exploit bank grouping through a pairwise bank-group interleaving scheme. The algorithm is evaluated using an ILP formulation, and provides schedules of optimal length for most of the considered LPDDR, DDR2, DDR3, LPDDR2, LPDDR3, and DDR4 devices. We derive the worst-case bandwidth, power, and execution time for the same set of devices, and discuss the observed trade-offs and trends in the scheduler-configuration design space based on these metrics, across memory generations.
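The benefit of bank-group interleaving on DDR4 stems from its two column-to-column delays: consecutive commands to different bank groups need only the short tCCD_S gap, while commands within one group require the longer tCCD_L. The sketch below illustrates the effect with assumed timing values, not a specific speed bin or the article's exact scheme:

```python
# Sketch of why bank-group interleaving shortens DDR4 schedules:
# consecutive column commands to *different* bank groups are spaced
# tCCD_S cycles apart, versus tCCD_L for the same group.
# Timing values are illustrative assumptions.

T_CCD_S, T_CCD_L = 4, 6   # cycles, assumed

def schedule_length(bank_groups):
    """Cycles spanned by one read per entry, in the given issue order."""
    cycles = 0
    for prev, cur in zip(bank_groups, bank_groups[1:]):
        cycles += T_CCD_L if prev == cur else T_CCD_S
    return cycles

same_group  = [0, 0, 0, 0, 1, 1, 1, 1]   # all of BG0, then all of BG1
interleaved = [0, 1, 0, 1, 0, 1, 0, 1]   # alternating bank groups
print(schedule_length(same_group), schedule_length(interleaved))
# -> 40 vs 28: interleaving replaces tCCD_L gaps with tCCD_S.
```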