2018 IEEE 11th International Conference on Cloud Computing (CLOUD) 2018
DOI: 10.1109/cloud.2018.00063

Serverless Data Analytics with Flint

Abstract: Serverless architectures organized around loosely coupled function invocations represent an emerging design for many applications. Recent work mostly focuses on user-facing products and event-driven processing pipelines. In this paper, we explore a completely different part of the application space and examine the feasibility of analytical processing on big data using a serverless architecture. We present Flint, a prototype Spark execution engine that takes advantage of AWS Lambda to provide a pure pay-as-you-g…

Citations: cited by 53 publications (26 citation statements)
References: 5 publications
“…CFs? time
TR-Spark [46]: Yes, No, No, n/a
Apache Flink [8]: Yes, No, Yes, Yes
Burscale [7]: Yes, No, Yes, Yes
Qubole [36]: No, Yes, No, No
Flint [26]: No, Yes, No, No
ExCamera [20]: No, Yes, n/a, n/a
numpywren [38]: No, Yes, No, No
PyWren [24]: No, Yes, No, No
Locus (PyWren+Redis) [35]: No, Yes, Yes, No
Cirrus [25]: No, Yes, Yes, No
gg [19]: No, Yes, Yes, No
FEAT [32], MArk [49]: Yes, Yes, n/a, n/a
SplitServe: Yes, Yes, Yes, Yes
Table 1. A comparison of SplitServe against the state-of-the-art platforms exploiting VMs and Cloud Functions (CFs).…”
Section: Related Work (mentioning)
confidence: 99%
“…Redis, being an in-memory dictionary, significantly improves on I/O operations compared to disk writes, but is quite expensive as it requires the use of large VMs. Flint [26], another prototype of Spark on AWS Lambda, replaces AWS S3 with SQS [2] for intermediate data I/O using multiple distributed queues, which is a better fit for a high number of small writes. SQS does better in terms of throughput but is costlier and less reliable compared to AWS S3.…”
Section: Related Work (mentioning)
confidence: 99%
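
The idea of routing intermediate data through multiple queues can be illustrated with a minimal sketch. This is not Flint's code: Python's in-memory `queue.Queue` stands in for SQS, and the names `partition_for`, `map_side_write`, `reduce_side_read`, and `word_count` are hypothetical, chosen only for this example. It assumes one queue per reduce partition and a hash of the key to pick the queue.

```python
import zlib
from collections import defaultdict
from queue import Queue


def partition_for(key: str, num_partitions: int) -> int:
    # Stable hash (CRC32) so every mapper routes a given key to the same queue.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def map_side_write(records, queues):
    # Each record is one small write to the queue that owns its key.
    for key, value in records:
        queues[partition_for(key, len(queues))].put((key, value))


def reduce_side_read(q):
    # Drain one queue and sum the values per key.
    totals = defaultdict(int)
    while not q.empty():
        key, value = q.get()
        totals[key] += value
    return dict(totals)


def word_count(mapper_outputs, num_partitions=2):
    # In-memory queues are an assumption standing in for distributed queues.
    queues = [Queue() for _ in range(num_partitions)]
    for records in mapper_outputs:
        map_side_write(records, queues)
    result = {}
    for q in queues:
        # Keys never span queues, so the per-queue results can be merged directly.
        result.update(reduce_side_read(q))
    return result
```

For example, `word_count([[("a", 1), ("b", 1)], [("a", 1)]])` sums the two `"a"` records in whichever queue owns that key. A real queue service adds concerns this sketch ignores, such as message size limits and delivery guarantees.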
“…These solutions can be classified into two types: (I) functions to orchestrate functions; and (II) external client schedulers. In the first category (e.g., [2], [3]), the orchestration is performed inside a serverless function. However, this approach suffers double billing according to the trilemma: The orchestrator function is billed while waiting for the execution of the orchestrated functions to complete (which are also billed).…”
Section: Related Work (mentioning)
confidence: 99%
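
The double-billing effect described in this statement can be made concrete with a small arithmetic sketch. It rests on assumptions not stated in the source: the orchestrated functions run in parallel, all functions are billed at the same rate per second, and an external client scheduler incurs no function billing. The name `billed_seconds` is hypothetical.

```python
def billed_seconds(worker_durations, orchestrated_inside_function):
    """Total billed function-seconds for one job (illustrative model only)."""
    worker_total = sum(worker_durations)
    if orchestrated_inside_function:
        # The orchestrator function stays alive, and billed, until the
        # slowest parallel worker finishes.
        orchestrator_wait = max(worker_durations, default=0)
    else:
        # Assumption: an external client scheduler is not billed as a function.
        orchestrator_wait = 0
    return worker_total + orchestrator_wait
```

With three workers running 10, 20, and 30 seconds, the in-function orchestrator adds 30 billed seconds on top of the 60 seconds of actual work, while the external scheduler adds none.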
“…In serverless computing, also referred to as Functions-as-a-Service (FaaS), application developers provide an event-driven function to cloud providers, and the cloud provider is responsible for seamlessly scaling function invocations to meet demands as event triggers occur. Serverless is powerful and expressive, with applications designed for video processing [29,41], HPC and scientific computing [36,51,89,93], machine learning [35,39,50], data analytics [44,55], chatbots [103], backends [31,67], IoT [69,102], and even general applications [40,92]. Indeed, a recent study of a production serverless offering indicates applications range from single functions to hundreds of functions in size, with function execution times ranging from less than a second to the order of minutes [88].…”
Section: Introduction (mentioning)
confidence: 99%