Assaf Eisenman scite author profile

Assaf Eisenman

6 Publications

86 Citation Statements Received

59 Citation Statements Given

How they've been cited

156

How they cite others

Affiliations

Meta (United States), Stanford University, Meta (Israel)

Publications


Reducing DRAM footprint with NVM in Facebook

Eisenman, Gardner, AbdelRahman, et al. 2018


Popular SSD-based key-value stores consume a large amount of DRAM in order to provide high-performance database operations. However, DRAM can be expensive for data center providers, especially given recent global supply shortages that have resulted in increasing DRAM costs. In this work, we design a key-value store, MyNVM, which leverages an NVM block device to reduce DRAM usage, and to reduce the total cost of ownership, while providing comparable latency and queries-per-second (QPS) as MyRocks on a server with a much larger amount of DRAM. Replacing DRAM with NVM introduces several challenges. In particular, NVM has limited read bandwidth, and it wears out quickly under a high write bandwidth. We design novel solutions to these challenges, including using small block sizes with a partitioned index, aligning blocks post-compression to reduce read bandwidth, utilizing dictionary compression, implementing an admission control policy for which objects get cached in NVM to control its durability, as well as replacing interrupts with a hybrid polling mechanism. We implemented MyNVM and measured its performance in Facebook's production environment. Our implementation reduces the size of the DRAM cache from 96 GB to 16 GB, and incurs a negligible impact on latency and queries-per-second compared to MyRocks. Finally, to the best of our knowledge, this is the first study on the usage of NVM devices in a commercial data center environment.
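The abstract mentions an admission control policy that limits which objects are written to NVM in order to control wear. As a rough illustration only, the sketch below admits objects while a per-window write budget remains and rejects them otherwise. The class name `WriteBudgetAdmission`, the byte budget, and the windowing scheme are assumptions made for this example; the abstract does not describe MyNVM's actual policy.

```python
class WriteBudgetAdmission:
    """Hypothetical policy: admit objects to an NVM cache only while a
    per-window write budget remains. Not MyNVM's actual algorithm."""

    def __init__(self, budget_bytes_per_window: int):
        self.budget = budget_bytes_per_window
        self.used = 0

    def start_window(self) -> None:
        # Call at the start of each accounting window to reset the budget.
        self.used = 0

    def admit(self, size_bytes: int) -> bool:
        # Reject the write if it would exceed the remaining budget.
        if self.used + size_bytes > self.budget:
            return False
        self.used += size_bytes
        return True


policy = WriteBudgetAdmission(budget_bytes_per_window=100)
decisions = [policy.admit(60), policy.admit(50), policy.admit(40)]
```

In this toy run the second object is rejected because it would exceed the budget, while the third still fits. A rejected object would simply bypass the NVM cache, trading some hit rate for a bounded write rate.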


Software-hardware co-design for fast and scalable training of deep learning recommendation models

Mudigere, Hao, Huang, et al. 2022


Deep learning recommendation models (DLRMs) have been used across many business-critical services at Meta and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper, we present Neo, a software-hardware co-designed system for high-performance distributed training of large-scale DLRMs. Neo employs a novel 4D parallelism strategy that combines table-wise, row-wise, column-wise, and data parallelism for training massive embedding operators in DLRMs. In addition, Neo enables extremely high-performance and memory-efficient embedding computations using a variety of critical systems optimizations, including hybrid kernel fusion, software-managed caching, and quality-preserving compression. Finally, Neo is paired with ZionEX, a new hardware platform co-designed with Neo's 4D parallelism for optimizing communications for large-scale DLRM training. Our evaluation on 128 GPUs using 16 ZionEX nodes shows that Neo outperforms existing systems by up to 40× for training 12-trillion-parameter DLRM models deployed in production.
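To make the partitioning terms in the abstract concrete, the sketch below shows what table-wise, row-wise, and column-wise splitting of embedding tables could look like on plain Python lists. The helper names are hypothetical and this is not Neo's implementation; data parallelism, the fourth dimension, would replicate a table on every worker and is omitted here.

```python
def shard_tables(tables, n_workers):
    # Table-wise: whole tables are assigned to workers round-robin.
    return [tables[w::n_workers] for w in range(n_workers)]


def shard_rows(table, n_workers):
    # Row-wise: each worker gets a contiguous block of rows
    # (up to n_workers blocks, using ceiling division for the block size).
    step = -(-len(table) // n_workers)
    return [table[i:i + step] for i in range(0, len(table), step)]


def shard_columns(table, n_workers):
    # Column-wise: each worker gets a slice of every row's embedding vector.
    width = len(table[0])
    step = -(-width // n_workers)
    return [[row[j:j + step] for row in table] for j in range(0, width, step)]


# A toy embedding table with 6 rows and 4 columns.
table = [[r * 10 + c for c in range(4)] for r in range(6)]
row_shards = shard_rows(table, 3)
col_shards = shard_columns(table, 2)
table_shards = shard_tables(["users", "items", "ads"], 2)
```

Row-wise splitting suits tables with many rows, while column-wise splitting divides wide embedding vectors; a real system would choose per table based on size and access pattern.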


Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Mudigere, Hao, Huang, et al. 2021

Preprint


Parallel Graph Processing

Eisenman, Cherkasova, Magalhaes, et al. 2016


Flashield: a Key-value Cache that Minimizes Writes to Flash

Eisenman, Cidon, Pergament, et al. 2017

Preprint


As its price per bit drops, SSD is increasingly becoming the default storage medium for cloud application databases. However, it has not become the preferred storage medium for key-value caches, even though SSD offers more than 10× lower price per bit and sufficient performance compared to DRAM. This is because key-value caches need to frequently insert, update and evict small objects. This causes excessive writes and erasures on flash storage, since flash only supports writes and erasures of large chunks of data. These excessive writes and erasures significantly shorten the lifetime of flash, rendering it impractical to use for key-value caches. We present Flashield, a hybrid key-value cache that uses DRAM as a "filter" to minimize writes to SSD. Flashield performs light-weight machine learning profiling to predict which objects are likely to be read frequently before getting updated; these objects, which are prime candidates to be stored on SSD, are written to SSD in large chunks sequentially. In order to efficiently utilize the cache's available memory, we design a novel in-memory index for the variable-sized objects stored on flash that requires only 4 bytes per object in DRAM. We describe Flashield's design and implementation, and we evaluate it on a real-world cache trace. Compared to state-of-the-art systems that suffer a write amplification of 2.5× or more, Flashield maintains a median write amplification of 0.5× without any loss of hit rate or throughput.
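A write amplification below 1× can look surprising. The sketch below assumes, for illustration, that write amplification is the ratio of bytes written to flash to bytes the application asked the cache to store, so a DRAM filter that keeps some objects off flash pushes the ratio below 1. The read-count threshold is a simple stand-in for the machine learning predictor mentioned in the abstract, and both function names are hypothetical.

```python
def write_amplification(flash_bytes_written, app_bytes_written):
    # Assumed definition: flash writes relative to application-level writes.
    return flash_bytes_written / app_bytes_written


def select_for_flash(read_counts, min_reads):
    # Stand-in filter: move an object from DRAM to flash only if it was
    # read at least min_reads times while it sat in DRAM.
    return [key for key, reads in read_counts.items() if reads >= min_reads]


# Reads observed for each object while it was held in DRAM.
reads_in_dram = {"a": 5, "b": 0, "c": 2, "d": 1}
admitted = select_for_flash(reads_in_dram, min_reads=2)

# If every object is 25 bytes, the application wrote 100 bytes in total
# and only the admitted objects reach flash.
wa = write_amplification(len(admitted) * 25, len(reads_in_dram) * 25)
```

Here two of the four objects are written to flash, so the toy ratio comes out at 0.5; the figure is an artifact of the example data and not a reproduction of the paper's measurement.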


Parallel Graph Processing on Modern Multi-core Servers: New Findings and Remaining Challenges

Eisenman, Cherkasova, Magalhaes, et al. 2016

