Proceedings of the 4th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond 2017
DOI: 10.1145/3070607.3070613

A containerized analytics framework for data and compute-intensive pipeline applications

Abstract: The joint effort of scientific collaborations and the expanding data market creates demand for high-performance and data-intensive analytics infrastructures that can exploit the potential of heterogeneous multi-core architectures with dynamic and scalable execution environments. Contemporary approaches focus on developing efficient parallel application models, but lack the flexibility to efficiently integrate and utilize native or accelerator-based code. In this work, we illustrate a novel approach on mendi…


Cited by 3 publications (4 citation statements, published in 2017 and 2021); references 21 publications.
“…However, Data Civilizer does not benefit from the systematic use of data context that is described here; it could be extended to do so. On the contrary, Big Data analytics platforms such as [7] and [6] focus on optimising the execution of composable data analytics workflows according to data locality and data flow. Our platform focuses on the design and implementation of a scalable and modularised data wrangling workflow in a domain-independent manner.…”
Section: Related Work (mentioning)
confidence: 99%
“…Such steps can be carried out using traditional Extract-Transform-Load (ETL) or Big Data analytics platforms [5], [6], [7], both requiring significant manual involvement in specifying, configuring, programming or tuning many of the steps [8], [9]. It is widely reported that intense manual involvement in such processes is expensive (e.g., [10]), often representing more than half the time of data scientists.…”
Section: Introduction (mentioning)
confidence: 99%
“…Such steps can be carried out using Extract-Transform-Load (ETL) [3] or Big Data analytics platforms [4], both necessitating significant manual involvement in specifying, configuring, programming or tuning many of the steps. It is widely reported that intense manual involvement in such processes is expensive…”
Section: Introduction (mentioning)
confidence: 99%
“…Transformed, integrated and repaired records' schematic correspondences. We utilize the Coma 3.0 community edition, specifically the Coma workflow (configuration 7001), which combines different metadata-based match heuristics. When data context is provided in D, each such data set is used as a partial extensional representation of the target to carry out instance-based matching with the source (line 5).…”
confidence: 99%
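
The excerpt above contrasts metadata-based match heuristics (as in the cited Coma workflow) with instance-based matching, where overlap between actual column values, drawn here from a data-context data set, drives the correspondence. As a rough illustration of the general technique only, and not of Coma 3.0's actual API, the following hypothetical Python sketch scores source columns against a data-context data set by Jaccard similarity of their normalized value sets; all names (`jaccard`, `match_instances`, the sample tables) are invented for this example.

```python
# Hypothetical sketch of instance-based schema matching: a data-context
# data set serves as a partial extensional representation of the target,
# and source columns are matched to target columns by value overlap.
# Illustrative only; this is not the Coma 3.0 API.

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two value sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_instances(source: dict, context: dict, threshold: float = 0.3):
    """Return (source_col, target_col, score) triples above `threshold`.

    `source` and `context` map column names to lists of cell values;
    `context` plays the role of a data-context data set D.
    """
    matches = []
    for s_col, s_vals in source.items():
        # Normalize values so trivial formatting differences do not
        # mask genuine instance overlap.
        s_set = {str(v).strip().lower() for v in s_vals}
        for t_col, t_vals in context.items():
            t_set = {str(v).strip().lower() for v in t_vals}
            score = jaccard(s_set, t_set)
            if score >= threshold:
                matches.append((s_col, t_col, score))
    # Strongest correspondences first.
    return sorted(matches, key=lambda m: -m[2])

# Invented sample data for demonstration.
source = {"town": ["Manchester", "Leeds", "York"],
          "pop": [547000, 789000, 208000]}
context = {"city": ["leeds", "york", "durham"],
           "population": [789000, 208000, 48000]}
print(match_instances(source, context))
# -> [('town', 'city', 0.5), ('pop', 'population', 0.5)]
```

In this toy run, the value overlap alone pairs "town" with "city" and "pop" with "population" even though the column names share no metadata, which is precisely the case where instance-based matching complements metadata-based heuristics.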