Long interconnects are becoming an increasingly important problem from both power and performance perspectives. This motivates designers to adopt on-chip network-based communication infrastructures and three-dimensional (3D) designs where multiple device layers are stacked together. Considering the current trends towards increasing use of chip multiprocessing, it is timely to consider 3D chip multiprocessor design and memory networking issues, especially in the context of data management in large L2 caches. The overall goal of this paper is to study the challenges for L2 design and management in 3D chip multiprocessors. Our first contribution is to propose a router architecture and a topology design that makes use of a network architecture embedded into the L2 cache memory. Our second contribution is to demonstrate, through extensive experiments, that a 3D L2 memory architecture generates much better results than the conventional two-dimensional (2D) designs under different number of layers and vertical (inter-wafer) connections. In particular, our experiments show that a 3D architecture with no dynamic data migration generates better performance than a 2D architecture that employs data migration. This also helps reduce power consumption in L2 due to a reduced number of data movements.
Soft errors induced by terrestrial radiation are becoming a significant concern in architectures designed in newer technologies. If left undetected, these errors can result in catastrophic consequences or costly maintenance problems in different embedded applications. In this article, we focus on utilizing the compiler's help in duplicating instructions for error detection in VLIW datapaths. The instruction duplication mechanism is further supported by a hardware enhancement for efficient result verification, which avoids the need of additional comparison instructions. In the proposed approach, the compiler determines the instruction schedule by balancing the permissible performance degradation and the energy constraint with the required degree of duplication. Our experimental results show that our algorithms allow the designer to perform trade-off analysis between performance, reliability, and energy consumption.
The problem attacked in this paper is one of automatically mapping an application onto a Network-on-Chip (NoC) based chip multiprocessor (CMP) architecture in a locality-aware fashion. The proposed compiler approach has four major steps: task scheduling, processor mapping, data mapping, and packet routing. In the first step, the application code is parallelized and the resulting parallel threads are assigned to virtual processors. The second step implements a virtual processor-to-physical processor mapping. The goal of this mapping is to ensure that the threads that are expected to communicate frequently with each other are assigned to neighboring processors as much as possible. In the third step, data elements are mapped to memories attached to CMP nodes. The main objective of this mapping is to place a given data item into a node which is close to the nodes that access it. The last step of our approach determines the paths (between memories and processors) for data to travel in an energy efficient manner. In this paper, we describe the compiler algorithms we implemented in detail and present an experimental evaluation of the framework. In our evaluation, we test our entire framework as well as the impact of omitting some of its steps. This experimental analysis clearly shows that the proposed framework reduces energy consumption of our applications significantly (27.41% on average over a pure performance oriented application mapping strategy) as a result of improved locality of data accesses.
The electrochemical reduction processes of Sb(III) on Au electrode were studied by cyclic voltammetry (CV) and electrochemical impedance spectroscopy (EIS) electrochemical measurements. The CV analyses reveal that the electrochemical reduction of
SbnormalO+
on Au electrode includes three parts: the irreversible reduction of adsorbed
SbnormalO+
, the reversible underpotential deposition (UPD) of dissociative
SbnormalO+
, and the irreversible overpotential deposition of dissociative
SbnormalO+
. Combining the analyses about the redox behaviors of
SbnormalO+
as well as the CV measuring results, the reduction of Sb(III) to
Sb0
in the presence of the complex reagent (tartaric acid
normalC4normalH4normalO62−
) can be conjectured as comprised of four processes: the reduction of adsorbed
SbnormalO+
to
Sb0
, the UPD of dissociative
SbnormalO+
, the reduction of absorbed
[Sb2(normalC4normalH4normalO6)]2+
to
Sb0
, and the reduction of dissociative
[Sb2(normalC4normalH4normalO6)]2+
to
Sb0
. What can also be conjectured by further EIS studies is that the three electrons are not gained simultaneously during the formation of
Sb0
in these four processes.
In this work, we experiment with complier-directed instruction duplication to detect soft errors in VLIW datapaths . In the proposed approach, the compiler determines the instruction schedule by balancing the permissible performance degradation with the required degree of duplication. Our experimental results show that our algorithms allow the designer to perform tradeoff analysis between performance and reliability.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.