While Processing-in-Memory has been investigated for decades, it has not been embraced commercially. A number of emerging technologies have renewed interest in this topic. In particular, the emergence of 3D stacking and the imminent release of Micron's Hybrid Memory Cube device have made it more practical to move computation near memory. However, the literature is missing a detailed analysis of a killer application that can leverage a Near Data Computing (NDC) architecture. This paper focuses on in-memory MapReduce workloads that are commercially important and are especially suitable for NDC because of their embarrassing parallelism and largely localized memory accesses. The NDC architecture incorporates several simple processing cores on a separate, non-memory die in a 3D-stacked memory package; these cores can perform Map operations with efficient memory access and without hitting the bandwidth wall. This paper describes and evaluates a number of key elements necessary in realizing efficient NDC operation: (i) low-EPI cores, (ii) long daisy chains of memory devices, (iii) the dynamic activation of cores and SerDes links. Compared to a baseline that is heavily optimized for MapReduce execution, the NDC design yields up to 15X reduction in execution time and 18X reduction in system energy.
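The execution model described above — Map tasks running entirely out of a core's local memory stack, with only compact partial results crossing to the host for Reduce — can be sketched as a toy word-count model in Python (the partitioning scheme and per-core counts here are illustrative, not the paper's simulated configuration):

```python
from collections import Counter

def ndc_map_phase(partitions):
    # Each near-memory core runs Map over its own partition only,
    # so raw input data never crosses the processor-memory link.
    return [Counter(doc.split()) for doc in partitions]

def reduce_phase(partials):
    # Host-side Reduce: merge the compact per-core partial counts.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Three "memory stacks", each holding one slice of the input corpus.
partitions = ["a b a", "b c", "a c c"]
counts = reduce_phase(ndc_map_phase(partitions))
print(counts["a"], counts["b"], counts["c"])  # 3 2 3
```

Because each Map task reads only its local slice, the model captures why the workload is embarrassingly parallel and why its memory accesses stay localized to one stack.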
DRAM vendors have traditionally optimized the cost-per-bit metric, often making design decisions that incur energy penalties. A prime example is the overfetch feature in DRAM, where a single request activates thousands of bitlines in many DRAM chips, only to return a single cache line to the CPU. The focus on cost-per-bit is questionable in modern-day servers, where operating costs can easily exceed the purchase cost. Modern technology trends are also placing very different demands on the memory system: (i) queuing delays are a significant component of memory access time, (ii) there is a high energy premium for the level of reliability expected for business-critical computing, and (iii) the memory access stream emerging from multi-core systems exhibits limited locality. All of these trends necessitate an overhaul of DRAM architecture, even if it means a slight compromise in the cost-per-bit metric. This paper examines three primary innovations. The first is a modification to DRAM chip microarchitecture that retains the traditional DDRx SDRAM interface. Selective Bitline Activation (SBA) waits for both RAS (row address) and CAS (column address) signals to arrive before activating exactly those bitlines that provide the requested cache line. SBA reduces energy consumption while incurring slight area and performance penalties. The second innovation, Single Subarray Access (SSA), fundamentally re-organizes the layout of DRAM arrays and the mapping of data to these arrays so that an entire cache line is fetched from a single subarray. It requires a different interface to the memory controller, reduces dynamic and background energy (by about 6X), incurs a slight area penalty (4%), and can even lead to performance improvements (up to 10%) by reducing queuing delays. The third innovation further penalizes the cost-per-bit metric by adding a checksum feature to each cache line.
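To make the SSA idea concrete, the following sketch (with assumed geometry parameters) maps each 64-byte cache line wholly into one subarray of one chip, in contrast to conventional DDRx striping, which spreads the line across every chip of the rank and activates a full row in each:

```python
CACHE_LINE = 64        # bytes per cache line
CHIPS = 8              # x8 DRAM chips on the rank (assumed)
SUBARRAYS = 16         # subarrays per chip (assumed)

def ssa_map(addr):
    # Place each cache line wholly inside one subarray of one chip,
    # so a read activates only that subarray's bitlines (no overfetch).
    line = addr // CACHE_LINE
    chip = line % CHIPS                     # spread lines across chips
    subarray = (line // CHIPS) % SUBARRAYS  # then across subarrays
    return chip, subarray

# Consecutive lines land on different chips (load is spread),
# yet any single access touches exactly one subarray.
print(ssa_map(0), ssa_map(64), ssa_map(512))  # (0, 0) (1, 0) (0, 1)
```

Spreading successive lines over chips and subarrays also helps queuing delay: independent requests can proceed in different subarrays concurrently, which is where the abstract's performance gains come from.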
This checksum error-detection feature can then be used to build stronger RAID-like fault tolerance, including chipkill-level reliability. Such a technique is especially crucial for the SSA architecture, where the entire cache line is localized to a single chip. This DRAM chip microarchitectural change leads to a dramatic reduction in the energy and storage overheads for reliability. The proposed architectures will also apply to other emerging memory technologies (such as resistive memories) and will be less disruptive to standards, interfaces, and the design flow if they can be incorporated into first-generation designs.
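The interplay between per-line checksums and RAID-like parity can be illustrated with a toy model (the checksum function, line size, and chip count are assumptions for illustration, not the paper's design): the checksum localizes the failed chip, and XOR parity across the surviving chips reconstructs its data.

```python
from functools import reduce

def checksum(line):
    # Simple additive checksum per cache line (illustrative only).
    return sum(line) & 0xFFFF

def xor_parity(lines):
    # RAID-like XOR parity computed across the data chips.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*lines))

# Four data "chips", each holding one 8-byte slice, plus stored parity.
data = [bytes([i + 1] * 8) for i in range(4)]
sums = [checksum(line) for line in data]
parity = xor_parity(data)

# Simulate a failed chip returning zeros for chip 2.
observed = list(data)
observed[2] = bytes(8)

# The per-line checksum localizes the faulty chip...
bad = next(i for i, line in enumerate(observed) if checksum(line) != sums[i])
# ...and XOR of the surviving chips plus parity reconstructs its data.
recovered = xor_parity([l for i, l in enumerate(observed) if i != bad] + [parity])
assert bad == 2 and recovered == data[2]
```

The key property is that detection (checksum) and correction (parity) are separated, so full-chip failures can be tolerated even when a cache line lives entirely on one chip, as in SSA.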
Bovine spongiform encephalopathy (BSE) is a transmissible spongiform encephalopathy of cattle, first detected in 1986 in the United Kingdom and subsequently in other countries. It is the most likely cause of variant Creutzfeldt-Jakob disease (vCJD) in humans. PrPSc from case 1 showed molecular features similar to typical BSE isolates, whereas PrPSc from case 2 revealed an unusual molecular PrPSc pattern: the molecular mass of the unglycosylated and monoglycosylated isoforms was higher than that of typical BSE isolates, and case 2 was strongly labeled with antibody P4, which is consistent with a higher molecular mass. Sequencing of the prion protein gene of both BSE-positive animals revealed that the sequences of both animals were within the range of prion protein gene sequence diversity previously reported for cattle.
Modern processors such as Tilera's Tile64, Intel's Nehalem, and AMD's Opteron are migrating memory controllers (MCs) on-chip, while maintaining a large, flat memory address space. This trend to utilize multiple MCs will likely continue, and a core or socket will consequently need to route memory requests to the appropriate MC via an inter- or intra-socket interconnect fabric similar to AMD's HyperTransport or Intel's QuickPath Interconnect. Such systems are therefore subject to non-uniform memory access (NUMA) latencies because of the time spent traveling to remote MCs. Each MC will act as the gateway to a particular piece of the physical memory, so data placement will become increasingly critical in minimizing memory access latencies. To date, no prior work has examined the effects of data placement among multiple MCs in such systems. Future chip-multiprocessors are likely to comprise multiple MCs and an even larger number of cores, which will increase the memory access latency variation in these systems. Proper allocation of workload data to the appropriate MC will be important in reducing the latency of memory service requests. The allocation strategy will need to be aware of queuing delays, on-chip latencies, and row-buffer hit rates at each MC. In this paper, we propose dynamic mechanisms that take these factors into account when placing data in appropriate slices of the physical memory. We introduce adaptive first-touch page placement and dynamic page-migration mechanisms to reduce DRAM access delays for multi-MC systems. These policies yield average performance improvements of 17% for adaptive first-touch page placement and 35% for a dynamic page-migration policy.
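A minimal sketch of the adaptive first-touch idea, assuming hypothetical per-MC statistics and illustrative weights: on a page's first touch, score every MC by queuing delay, on-chip hop count, and row-buffer hit rate, and map the page to the cheapest one.

```python
def pick_mc(mcs):
    # Score each memory controller; lower cost = preferred placement.
    # The weights are illustrative assumptions, not the paper's tuning.
    def cost(mc):
        return (10 * mc["queue_len"]       # queuing delay at the MC
                + 2 * mc["hops"]           # on-chip routing distance
                - 5 * mc["row_hit_rate"])  # row-buffer locality benefit
    return min(mcs, key=cost)

mcs = [
    {"id": 0, "queue_len": 8, "hops": 1, "row_hit_rate": 0.4},
    {"id": 1, "queue_len": 2, "hops": 3, "row_hit_rate": 0.6},
]
print(pick_mc(mcs)["id"])  # 1: lightly loaded MC wins despite extra hops
```

The example shows why distance alone is a poor placement criterion: a nearby but congested MC can cost more in queuing delay than a remote, lightly loaded one, which is the case that motivates the dynamic policies.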
A large body of work demonstrates income-related disparities in access to coordinated preventive care in patients with diabetes and other chronic conditions. Much less information exists on associations between poverty and consequential negative health outcomes. Few studies have assessed geographic patterns linking household incomes to major, preventable complications of chronic diseases. Using statewide facility discharge data for California during 2009, we identified 7,973 lower extremity amputations in 6,828 diabetic adults. We mapped amputation events based on residential zip codes, and used US census data to produce corresponding maps of poverty rate. Comparisons of the maps show amputation “hotspots” in lower income urban and rural regions of California. Prevalence-adjusted amputation rates varied ten-fold between high-income and low-income regions. While our analysis does not support detailed causal inferences, our method for mapping complication “hot spots” using existing public data sources may help target interventions to communities most in need.
Safety-net hospitals rely on Disproportionate Share Hospital (DSH) payments to help cover uncompensated care costs and underpayments by Medicaid (known as Medicaid shortfalls). The Affordable Care Act (ACA) anticipates that insurance expansions will increase safety-net hospitals’ revenues, and reduces DSH payments accordingly. We examined the impact of the ACA’s Medicaid DSH reductions on California public hospitals’ financial stability by estimating how total DSH costs (uncompensated care costs and Medicaid shortfalls) will change as a result of insurance expansions and the offsetting DSH reductions. Decreases in uncompensated care costs due to the ACA insurance expansion may not match the ACA’s DSH reductions because of the high number of residually uninsured patients, low Medicaid reimbursement, and medical cost inflation. Taking these three factors into account, we estimate that California public hospitals’ total DSH costs will increase from $2.044 billion in 2010 to $2.363 billion in 2019, with unmet DSH costs of $1.381 billion to $1.537 billion.