We present FPRaker, a processing element for composing training accelerators. FPRaker processes several floating-point multiply-accumulate operations concurrently and accumulates their results into a higher-precision accumulator. FPRaker boosts performance and energy efficiency by exploiting the value distributions that naturally arise during training. It processes the significands of the operands of each multiply-accumulate as a series of signed powers of two, converting them to this form on the fly. This exposes ineffectual work that can be skipped: the encoded values typically have few terms, and some of those can be discarded because they would fall outside the range of the accumulator given the limited precision of floating-point arithmetic. FPRaker also takes advantage of spatial correlation in values across channels and uses delta encoding off-chip to reduce memory footprint and bandwidth. We demonstrate that FPRaker can be used to compose an accelerator for training and that it improves performance and energy efficiency compared to optimized bit-parallel floating-point units under iso-compute-area constraints. We also demonstrate that FPRaker delivers additional benefits when training incorporates pruning and quantization. Finally, we show that FPRaker naturally amplifies performance with training methods that use a different precision per layer.
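To make the mechanism concrete, the following is a minimal Python sketch, under our own assumptions rather than FPRaker's exact datapath, of the two ideas above: recoding a significand into signed powers of two on the fly, and skipping terms whose contribution would fall outside the accumulator's limited-precision window. The helper names (to_signed_powers, mac_terms), the canonical-signed-digit recoding, and the simplified window test are illustrative assumptions, not details taken from the paper.

    def to_signed_powers(sig):
        """Recode an unsigned integer significand into signed powers of two
        (canonical signed digit form): returns (sign, power) pairs such that
        sig == sum(sign * 2**power). Runs of 1s collapse into two terms,
        e.g. 0b0111 (7) -> [(-1, 0), (+1, 3)]. The operand's sign bit is
        assumed to be handled separately, as in a typical FP datapath."""
        terms, power = [], 0
        while sig:
            if sig & 1:
                sign = 2 - (sig & 3)   # +1 if bits end in ...01, -1 if ...11
                terms.append((sign, power))
                sig -= sign            # clears the current run of 1s
            sig >>= 1
            power += 1
        return terms

    def mac_terms(sig_a, exp_a, sig_b, exp_b, acc_exp, acc_bits=13):
        """Accumulate (sig_a * 2**exp_a) * (sig_b * 2**exp_b) one signed
        power-of-two term at a time. A term is treated as ineffectual and
        skipped when its exponent lands below the accumulator's precision
        window [acc_exp - acc_bits, acc_exp); this test ignores the width
        of sig_b, a deliberate simplification for illustration."""
        total, skipped = 0, 0
        for sign, p in to_signed_powers(sig_a):
            term_exp = exp_a + exp_b + p     # exponent of this partial product
            if term_exp < acc_exp - acc_bits:
                skipped += 1                 # ineffectual work: outside window
                continue
            total += sign * sig_b * 2**term_exp
        return total, skipped

As a usage example, computing 1.75 * 1.5 (significands 0b111 and 0b11) with a deliberately narrow accumulator window shows a term being dropped:

    total, skipped = mac_terms(sig_a=0b111, exp_a=-2, sig_b=0b11, exp_b=-1,
                               acc_exp=0, acc_bits=2)
    # The -2^0 term of sig_a maps to exponent -3, below the window, so it is
    # skipped: total == 3.0 rather than the exact 2.625, and skipped == 1.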