2021
DOI: 10.14778/3457390.3457399
Tensor relational algebra for distributed machine learning system design

Abstract: We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or te…
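The "whole tensors through a series of compute kernels" abstraction the abstract critiques can be illustrated with a minimal sketch. This is an illustrative toy using NumPy, not code from the paper; the layer shapes and names are assumptions.

```python
import numpy as np

# Each step below is one kernel invocation over the *entire* tensor,
# mirroring how a GPU runtime executes matmul and activation kernels.
def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
X = rng.standard_normal((128, 64))   # full input batch as one tensor
W1 = rng.standard_normal((64, 32))   # layer-1 weights
W2 = rng.standard_normal((32, 8))    # layer-2 weights

H = relu(X @ W1)   # kernel 1: matmul; kernel 2: activation
Y = H @ W2         # kernel 3: matmul
# Y has shape (128, 8); nothing in this pipeline says how to split X or W1
# across machines, which is the scaling gap the paper targets.
```

Because every kernel consumes and produces a whole tensor, distribution and out-of-memory handling must be bolted on outside the abstraction, which is the motivation for a relational treatment of tensors.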

Cited by 18 publications (7 citation statements) | References 34 publications
“…Our work complements prior work on linear algebra computation powered by database engines [7,14,29,35,45] and on languages that unify linear algebra and relational algebra [13,17,25]. No prior work considered the interaction of QR decomposition with database joins.…”
Section: Introduction
confidence: 74%
“…There has been a plethora of work in the past decade focusing on in-DB ML [80,44,37,70,53,63,32,67,50,56,45,57,78,48]. Most existing in-DB ML systems implement SGD as "User-Defined Aggregates" (UDA) [37,44].…”
Section: In-database Machine Learning Systems
confidence: 99%
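The citation statement above notes that most in-DB ML systems implement SGD as "User-Defined Aggregates" (UDA). A UDA decomposes an aggregate into per-tuple accumulation, a merge of partial states, and a finalizer, which is what lets the database parallelize it. The sketch below is a hedged illustration of that pattern in Python for a least-squares gradient step; the class name and the init/accumulate/merge/finalize interface are illustrative, not taken from any particular system.

```python
import numpy as np

class GradientUDA:
    """Toy UDA computing one gradient-descent step over relation rows (x, y)."""

    def __init__(self, w):
        # aggregate state: current weights, running gradient, row count
        self.w = np.asarray(w, dtype=float)
        self.grad = np.zeros_like(self.w)
        self.n = 0

    def accumulate(self, x, y):
        # per-tuple transition: add this row's squared-loss gradient
        x = np.asarray(x, dtype=float)
        self.grad += (x @ self.w - y) * x
        self.n += 1

    def merge(self, other):
        # combine partial states computed on different partitions/workers
        self.grad += other.grad
        self.n += other.n
        return self

    def finalize(self, lr=0.1):
        # one step on the mean gradient
        return self.w - lr * self.grad / max(self.n, 1)
```

Because `merge` is associative, the engine can scan partitions in parallel and combine their states, which is why the UDA interface is the common integration point for ML in an RDBMS.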
“…In-DB ML Previous work [80,44,37,70,53,63,32,67,50,56,45,57,78,48,58,15] has intensively discussed how to implement ML models on relational data, such as linear models [70,53,63], linear algebra [32,56,57], factorization models [67], neural networks [45,57,78] and other statistical learning models [50], using Batch Gradient Descent (BGD) or SGD, over join or self-defined matrix/tensors, etc. The most common way of integrating ML algorithm into RDBMS is to use User-Defined Aggregate Functions (UDA).…”
Section: Related Work
confidence: 99%
“…Second, the current version of BAGUA only focuses on data parallelism and it is interesting future work to integrate other techniques such as model parallelism (e.g. [40,41,42,43,44,45,46,47]) and pipeline parallelism (e.g., [48,49,50,51]) and to understand the system abstractions.…”
Section: Limitations and Moving Forward
confidence: 99%
“…BAGUA is built on decades of research regarding distributed machine learning systems and algorithms. Plenty of them are from the database community [52,53,54,55,56,57,58,59,60,61,46,47]. We now summarize related work and discuss some in details to provide backgrounds and contexts.…”
Section: Preliminaries and Related Work
confidence: 99%