2018
DOI: 10.14778/3213880.3213890

Evaluating end-to-end optimization for data analytics applications in Weld

Abstract: Modern analytics applications use a diverse mix of libraries and functions. Unfortunately, there is no optimization across these libraries, resulting in performance penalties as high as an order of magnitude in many applications. To address this problem, we proposed Weld, a common runtime for existing data analytics libraries that performs key physical optimizations such as pipelining under existing, imperative library APIs. In this work, we further develop the Weld vision by designing an automatic adaptive op…

Cited by 52 publications (31 citation statements)
References 34 publications (49 reference statements)
“…Python and R offer popular libraries that are easy to use and provide fast development cycles. These libraries are embedded shallowly [30] in the host language, i.e., they are executed as-is, without any inter-library optimizations and support for large data [53]. General purpose dataflow systems [71,2] provide second-order functions (e.g., map and reduce) to transform collections via user-defined functions (UDFs).…”
Section: Introduction (mentioning)
confidence: 99%
“…We discuss workloads below. NumPy Numerical Analysis (Figures 4a-d [55]. The number in parentheses shows the number of library API calls.…”
Section: End-to-end Performance Results (mentioning)
confidence: 99%
“…We evaluate the end-to-end performance benefits of SAs using Mozart on a suite of 15 data analytics workloads (of which four are repeated in NumPy and Intel MKL). Eight of the benchmarks are taken from the Weld evaluation [55], which obtained them from popular GitHub repositories, Kaggle competitions, and online tutorials. We also evaluate an additional numerical analysis workload (Shallow Water) over matrix operations, taken from the Bohrium paper [46] (Bohrium is an optimizing NumPy compiler that we compare against here).…”
Section: Workloads (mentioning)
confidence: 99%
“…Most models for data programming typically involve different types of side-effect free transformations on data collections and are adequately covered by prior work on Weld [12,13] among others. The purpose of Arc is to complement Weld with data stream semantics.…”
Section: Preliminaries: Weld IR Model (mentioning)
confidence: 99%