Machine learning powers diverse services in industry, including search, translation, recommendation systems, and security. The scale and importance of these models require that they be efficient, expressive, and portable across an array of heterogeneous hardware devices. These constraints are often at odds; to better accommodate them we propose a new high-level intermediate representation (IR) called Relay. Relay is being designed as a purely functional, statically typed language with the goal of balancing efficient compilation, expressiveness, and portability. We discuss the goals of Relay and highlight its important design constraints. Our prototype is part of the open-source NNVM compiler framework, which powers Amazon's deep learning framework MXNet.
Garbage-collected language runtimes must carefully tune heap limits to reduce garbage collection time and memory usage. However, there is a trade-off: a lower heap limit reduces memory use but increases garbage collection time. Classic methods for setting heap limits include manually tuned heap limits and multiple-of-working-memory rules of thumb. But because it is a trade-off, it is not clear which heap limit rule is best or even how to compare them. We address this problem with a new framework where heap limits are set for multiple heaps at once. In this framework, standard heap limit rules are non-compositional: multiple heaps using the same heap limit rule allocate memory in non-Pareto-optimal ways. We use our framework to derive a compositional "square-root" heap limit rule, which minimizes total memory usage for any amount of total garbage collection time. Paradoxically, the square-root heap limit rule achieves coordination without communication: it allocates memory optimally across multiple heaps without requiring any communication between heaps. To demonstrate that this heap limit rule is effective, we prototype it for V8, the JavaScript runtime used in Google Chrome, Microsoft Edge, and other browsers, as well as in server-side frameworks like Node.js and Deno. On real-world web pages, our prototype achieves reductions of approximately 16.99% in memory usage. On memory-intensive benchmarks, reductions of up to 6.55% in garbage collection time are possible with no change in total memory usage.
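The compositional character of a square-root rule can be illustrated with a small sketch. Here each heap sets its limit as live memory plus extra space proportional to the square root of its own allocation rate and collection speed; the exact formula, the constant `c`, and all parameter values below are illustrative assumptions, not the paper's tuned implementation:

```python
import math

def heap_limit(live_bytes, alloc_rate, gc_speed, c=1.0):
    """Illustrative square-root heap limit rule.

    Extra heap space grows with sqrt(live * alloc_rate / gc_speed),
    computed from this heap's local statistics only, so no
    coordination between heaps is needed. The constant c trades
    total memory against total GC time globally.
    """
    extra = math.sqrt(c * live_bytes * alloc_rate / gc_speed)
    return live_bytes + extra

# Each heap computes its own limit independently.
limit_a = heap_limit(live_bytes=100_000_000, alloc_rate=10_000_000,
                     gc_speed=100_000_000)
limit_b = heap_limit(live_bytes=200_000_000, alloc_rate=10_000_000,
                     gc_speed=100_000_000)
```

Note the sublinear scaling: doubling a heap's live memory grows its extra headroom only by a factor of sqrt(2), which is what lets many heaps sharing one rule avoid over-allocating in aggregate.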