Ehsan Atoofian scite author profile

Ehsan Atoofian

5 Publications

28 Citation Statements Received

99 Citation Statements Given

Affiliations

Lakehead University, University of Victoria, University of Tehran

Publications

Reducing shift penalty in Domain Wall Memory through register locality

Atoofian

2015

General-purpose graphics processing units (GPGPUs) can execute hundreds to thousands of threads simultaneously. Such extreme multithreading requires a large register file to hold the state of executing threads and to facilitate context switching. As feature sizes shrink, power consumption in the large register file becomes a major concern. In this work, we exploit Domain Wall Memory (DWM), a spin-based memory, to reduce power consumption in the register file. DWM is a promising technology that offers non-volatility, high energy efficiency, and high density by storing several bits in the domains of a ferromagnetic wire. However, despite the favourable properties of DWM over SRAM technology, DWM poses a unique challenge: bits must be accessed serially through shift operations, leading to variable and potentially higher access latencies. To address this challenge, we propose a new predictive shift policy. In this policy, we exploit register locality across threads and predict the source and destination operands of instructions. We record the history of registers accessed by instructions and speculatively shift the magnetic domains of DWM tracks for subsequent instructions. Over a wide range of applications from the NVIDIA CUDA SDK, ISPASS, and Rodinia, our predictive scheme achieves dramatic energy savings over an SRAM register file while changing performance negligibly.
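The idea of a history-based predictive shift can be illustrated with a toy model. The sketch below is not the paper's implementation: the single-port track, the `DwmTrack` class, the `run` function, the per-PC history table, and the assumption that speculative shifts are fully hidden off the critical path are all simplifications introduced here for illustration.

```python
class DwmTrack:
    """Toy model of one DWM track with a single access port (an assumption)."""

    def __init__(self, length):
        self.length = length
        self.head = 0      # domain currently aligned with the access port
        self.shifts = 0    # total shift operations performed (demand + speculative)

    def shift_to(self, pos):
        """Align domain `pos` with the port and return the number of shifts."""
        cost = abs(pos - self.head)
        self.head = pos
        self.shifts += cost
        return cost


def run(trace, predictive):
    """Replay (pc, register) accesses and return shifts exposed to the access.

    trace: list of (pc, reg) tuples, interleaved across threads.
    predictive: if True, pre-shift for the next instruction using the register
    that the same PC accessed last time (possibly in another thread).
    """
    track = DwmTrack(length=32)
    history = {}           # pc -> register last accessed by that instruction
    exposed = 0
    for i, (pc, reg) in enumerate(trace):
        exposed += track.shift_to(reg)   # demand shift: delays this access
        history[pc] = reg
        if predictive and i + 1 < len(trace):
            next_pc = trace[i + 1][0]    # assumed known before operand read
            if next_pc in history:
                # Speculative shift, assumed hidden behind other work.
                track.shift_to(history[next_pc])
    return exposed
```

In this model, two threads that execute the same instructions on the same register indices benefit from each other's history: the second thread finds the track already aligned. Mispredicted speculative shifts still cost energy (they are counted in `track.shifts`), which a fuller model would need to account for.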

Automatic Optimization of Software Transactional Memory Through Linear Regression and Decision Tree

Xiao

Atoofian

et al. 2015

Improving Performance of Transactional Applications through Adaptive Transactional Memory

Jeyakumaran

Atoofian

Xiao

et al. 2016

With the rise of chip multiprocessors (CMPs), parallel programming has become necessary to exploit their computational power. Traditionally, lock-based mechanisms have been used to synchronize accesses to shared variables in parallel programs. However, given the complexity associated with locks, writing a correct parallel program is a heavy burden for programmers. As an alternative, Transactional Memory (TM) is gaining momentum as a parallel programming model for multi-core processors. TM provides programmers with an atomic construct (a transaction) that guarantees atomicity of accesses to shared variables, with synchronization handled by the underlying system. Transactional memory comes in two variants: Software Transactional Memory (STM) and Hardware Transactional Memory (HTM). Both STM and HTM systems have advantages and disadvantages that either enhance or penalize performance in transactional applications. This thesis focuses on implementing an adaptive system that exploits both STM and HTM at transaction granularity. The goal is to achieve a performance gain by combining the benefits of both TM systems. A synchronization technique is developed to switch seamlessly between HTM and STM based on the characteristics of a transaction. We use a decision tree to predict the optimum system for each transaction in a given application. The decision tree is a form of supervised machine learning that classifies transactions based on parameters such as transaction size and transaction write ratio. In evaluations using the STAMP, NAS, and DiscoPoP benchmark suites, the proposed adaptive system improves the speed of transactional applications by 20.82% on average.
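A per-transaction choice between HTM and STM could look roughly like the sketch below. It is only an illustration: the abstract says the tree is learned from training data, whereas here the tree is written by hand, and the names `Transaction`, `choose_tm`, `HTM_CAPACITY`, and `WRITE_HEAVY`, the threshold values, and the direction of each split are hypothetical placeholders, not values from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    size: int            # memory locations touched (hypothetical unit)
    write_ratio: float   # fraction of accesses that are writes, in [0, 1]


# Hypothetical thresholds; a learned tree would derive its own from data.
HTM_CAPACITY = 64
WRITE_HEAVY = 0.5


def choose_tm(tx):
    """Walk a two-level decision tree and return "HTM" or "STM"."""
    if tx.size > HTM_CAPACITY:
        # Large transactions may exceed the hardware's buffering capacity.
        return "STM"
    if tx.write_ratio > WRITE_HEAVY:
        # Placeholder split on the second feature named in the abstract.
        return "STM"
    return "HTM"
```

For example, `choose_tm(Transaction(size=8, write_ratio=0.1))` follows both "no" branches and returns `"HTM"`. In a learned version, the features would be measured per transaction and the splits fitted from labelled runs on each system.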

Using supplier locality in power-aware interconnects and caches in chip multiprocessors

Atoofian

Baniasadi

2008

Journal of Systems Architecture

Improving performance of transactional memory through machine learning

Xiao

Jeyakumaran

Atoofian

et al. 2017

Concurrency and Computation

Transactional memory (TM) is a programming paradigm that facilitates parallel programming for multi‐core processors. In the last few years, some chip manufacturers have provided hardware support for TM to reduce the runtime overhead of Software Transactional Memory (STM). In this work, we propose two optimization techniques for TMs. The first technique focuses on Restricted Transactional Memory (RTM) in Intel's Haswell processor and shows that RTM improves performance over STM in some applications but falls behind STM in others. We exploit this variability and propose an adaptive technique that statically switches between RTM and STM. The second technique focuses on the overhead of TM and enhances the speed of the adaptive system. In particular, we focus on the size of transactions and improve performance by changing the transaction size. Optimizing the transaction size manually is a time‐consuming process and requires significant software engineering effort. We use a combination of Linear Regression (LR) and a decision tree to decide on the transaction size automatically. We evaluate our optimization techniques using a set of benchmarks from the NAS, DiscoPoP, and STAMP benchmark suites. Our experimental results show that our optimization techniques improve the performance of TM programs by 9% and energy‐delay by 15%, on average.
