Yunho Jin scite author profile

Many cloud service providers employ specialized hardware accelerators, called neural processing units (NPUs), to accelerate deep neural networks (DNNs). An NPU scheduler is responsible for scheduling incoming user requests and must satisfy two often-conflicting optimization goals: maximizing system throughput and satisfying the quality-of-service (QoS) constraints (e.g., deadlines) of individual requests. We propose Layerweaver+, a low-cost layer-wise DNN scheduler for NPUs, which provides both high system throughput and minimal QoS violations. For a serving scenario based on the industry-standard MLPerf inference benchmark, Layerweaver+ improves system throughput by up to 266.7% over a baseline scheduler that serves one DNN at a time.

A simple algorithm for determining of movement duration in task space without violating joint angle constraints

Jin, Choi

The Evolutionary Emergence of Neural Organization in a Hydra-like Animat

Jones, Jin, Sendhoff, et al. 2009. Front. Comput. Neurosci.

Architecting a Flash-Based Storage System for Low-Cost Inference of Extreme-Scale DNNs

Jin, Kim, Ham, et al. 2022. IEEE Trans. Comput.

Layerweaver: Maximizing Resource Utilization of Neural Processing Units via Layer-Wise Scheduling

Layerweaver+: A QoS-Aware Layer-Wise DNN Scheduler for Multi-Tenant Neural Processing Units
