Taurus

Maas, Martin; Asanović, Krste; Harris, Tim; Kubiatowicz, John

doi:10.1145/2980024.2872386

Cited by 2 publications

References 34 publications

A Study on Garbage Collection Algorithms for Big Data Environments

Bruno

Ferreira

2018

ACM Comput. Surv.

The need to process and store massive amounts of data—Big Data—is a reality. In areas such as scientific experiments, social networks management, credit card fraud detection, targeted advertisement, and financial analysis, massive amounts of information are generated and processed daily to extract valuable, summarized information. Due to its fast development cycle (i.e., less expensive to develop), mainly because of automatic memory management, and rich community resources, managed object-oriented programming languages (e.g., Java) are the first choice to develop Big Data platforms (e.g., Cassandra, Spark) on which such Big Data applications are executed. However, automatic memory management comes at a cost. This cost is introduced by the garbage collector, which is responsible for collecting objects that are no longer being used. Although current (classic) garbage collection algorithms may be applicable to small-scale applications, these algorithms are not appropriate for large-scale Big Data environments, as they do not scale in terms of throughput and pause times. In this work, current Big Data platforms and their memory profiles are studied to understand why classic algorithms (which are still the most commonly used) are not appropriate, and also to analyze recently proposed and relevant memory management algorithms, targeted to Big Data environments. The scalability of recent memory management algorithms is characterized in terms of throughput (improves the throughput of the application) and pause time (reduces the latency of the application) when compared to classic algorithms. The study is concluded by presenting a taxonomy of the described works and some open problems, with regard to Big Data memory management, that could be addressed in future works.

From warm to hot starts

Carreira

Kohli

Bruno

et al. 2021

Proceedings of the Workshop on Hot Topics in Operating Systems

The serverless computing model leverages high-level languages, such as JavaScript and Java, to raise the level of abstraction for cloud programming. However, today's design of serverless computing platforms based on stateless short-lived functions leads to missed opportunities for modern runtimes to optimize serverless functions through techniques such as JIT compilation and code profiling. In this paper, we show that modern serverless platforms, such as AWS Lambda, do not fully leverage language runtime optimizations. We find that a significant number of function invocations running on warm containers are executed with unoptimized code (warm-starts), leading to orders of magnitude performance slowdowns. We explore the idea of exploiting the runtime knowledge spread throughout potentially thousands of nodes to profile and optimize code. To that end, we propose Ignite, a serverless platform that orchestrates runtimes across machines to run optimized code from the start (hot-start). We present evidence that runtime orchestration has the potential to greatly reduce cost and latency of serverless workloads by running optimized code across thousands of serverless functions.
