2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps.2018.00019
Performance Isolation of Data-Intensive Scale-out Applications in a Multi-tenant Cloud

Cited by 17 publications (13 citation statements) | References 21 publications
“…In this context, a job is composed of multiple smaller tasks (defined as the smallest unit of computation observable by the resource manager) [82]. Such jobs and their constituent tasks are scheduled onto different machines in parallel to accelerate job completion, and are often divided into phases forming a Directed Acyclic Graph (DAG) [83]. Application frameworks (such as MapReduce) attempt to sub-divide jobs so that the tasks of each phase complete within approximately the same timeframe [84].…”
Section: Straggler Definition and Impact (mentioning)
confidence: 99%
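The decomposition described above can be sketched in a few lines. This is an illustrative model only (the names `Task`, `Phase`, and `split_job` are assumptions, not from the cited frameworks): a job is divided into chained phases, and within each phase the work is split into tasks of near-equal size so they finish in roughly the same timeframe.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    phase: str
    index: int
    work_units: int  # smallest unit of computation visible to the resource manager

@dataclass
class Phase:
    name: str
    tasks: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)  # DAG edges to earlier phases

def split_job(total_work: int, phases: list, tasks_per_phase: int) -> list:
    """Divide a job into chained phases, giving every task in a phase
    near-equal work so tasks complete within the same timeframe."""
    dag = []
    for name in phases:
        base, rem = divmod(total_work, tasks_per_phase)
        tasks = [Task(name, i, base + (1 if i < rem else 0))
                 for i in range(tasks_per_phase)]
        # each phase depends on the previous one, e.g. map -> reduce
        dag.append(Phase(name, tasks, depends_on=dag[-1:]))
    return dag

dag = split_job(total_work=10, phases=["map", "reduce"], tasks_per_phase=4)
```

Real frameworks balance by input-split size rather than abstract work units, but the chained-phase DAG structure is the same.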
“…In this context, a job is composed of multiple smaller tasks (defined as the smallest unit of computation observable by the resource manager) [82]. Such jobs and subsequent tasks are scheduled onto different machines in a parallelized manner to accelerate job completion and are often divided into phases creating a Direct Acyclic Graph (DAG) [83]. Application frameworks (such as MapReduce) attempt to sub-divide jobs so that tasks will approximately complete within the same timeframe for each phase [84].…”
Section: Straggler Definition and Impactmentioning
confidence: 99%
“…The central resource manager (RM) is application-agnostic and completely unaware of the runtime QoS requirements of interactive and latency-sensitive applications; the RM is responsible only for resource allocation among jobs and leaves all application-specific logic to application managers. Existing solutions for workload co-location either aim to reduce performance interference through resource partitioning and isolation [10] [11] [12] or leverage QoS-aware scheduling that places jobs/applications so as to minimize interference [13] [14] [15]. However, they are optimized for monolithic applications and have only indirect effects on DLRAs, which exhibit more sophisticated component dependencies and performance variations (e.g., latency) owing to the vast number of requests flowing across entire system components.…”
Section: Renyu Yang Is the Corresponding Author (mentioning)
confidence: 99%
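The QoS-aware placement idea mentioned above can be sketched as choosing, among candidate nodes, the one with the smallest predicted interference against already co-located jobs. Everything here is a hypothetical illustration under assumed names (`Node`, `pairwise_interference`), not the method of any cited system:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    jobs: list = field(default_factory=list)  # jobs already running on this node

def pairwise_interference(job_a: str, job_b: str) -> float:
    # Stand-in interference model: assume two I/O-heavy jobs contend strongly.
    heavy = {"sort", "shuffle", "backup"}
    return 1.0 if job_a in heavy and job_b in heavy else 0.1

def place(job: str, nodes: list) -> Node:
    """Return the node where the summed predicted interference between
    `job` and that node's resident jobs is smallest."""
    return min(nodes, key=lambda n: sum(pairwise_interference(job, j)
                                        for j in n.jobs))

nodes = [Node("n1", ["sort"]), Node("n2", ["web"])]
chosen = place("shuffle", nodes)  # avoids co-locating two I/O-heavy jobs
```

In practice the interference predictor is learned from profiling or runtime metrics; the placement rule itself stays this simple.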
“…For tasks without locality specifications, TOPOSCH is more likely to place a task onto a node with a lower risk level, reducing the impact of co-location on latency. To achieve this, we adopt a random-number-based approach that gives low-risk nodes a higher probability of being chosen (Lines 11-14). Furthermore, so that the DLRA's QoS dominates, the RM has the privilege to preempt and evict running batch tasks to remedy detected QoS degradation.…”
Section: B Task Delay Scheduling Under Resource Reservation (mentioning)
confidence: 99%
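A random-number-based selection that favors low-risk nodes can be realized with weighted sampling. This is a minimal sketch under an assumed weighting (weight = 1 − risk), not TOPOSCH's actual routine from Lines 11-14 of the cited algorithm:

```python
import random

def pick_low_risk_node(risk_by_node: dict, rng=random) -> str:
    """Randomly pick a node, giving lower-risk nodes higher probability.

    risk_by_node maps node name -> risk level in [0, 1].
    Weight 1 - risk means a node at risk 0.1 is chosen ~9x more often
    than a node at risk 0.9.
    """
    nodes = list(risk_by_node)
    weights = [1.0 - risk_by_node[n] for n in nodes]  # low risk -> high weight
    return rng.choices(nodes, weights=weights, k=1)[0]
```

The randomness matters: a deterministic "always pick the lowest-risk node" rule would herd all incoming tasks onto one node, while weighted sampling spreads load yet still biases away from risky co-locations.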
“…Other elements of related work also rely on internal metrics such as memory and CPU usage [7,10,27,28]. PerfCloud [30] uses system-level metrics to proactively detect performance interference between tenant workloads, showing that such approaches avoid costly workload profiling and prediction mechanisms without interfering with application code.…”
Section: Related Work (mentioning)
confidence: 99%
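Detection from system-level metrics alone, as described above, can be sketched as a simple baseline-deviation check. This is in the spirit of such approaches rather than PerfCloud's actual mechanism (the metric names and the 25% threshold are assumptions); the point it illustrates is that no application instrumentation or workload profiling is required:

```python
def interference_detected(baseline: dict, current: dict,
                          threshold: float = 0.25) -> bool:
    """Flag interference when any system-level metric deviates from the
    tenant's solo-run baseline by more than `threshold` (relative)."""
    for metric, base in baseline.items():
        if base and abs(current[metric] - base) / base > threshold:
            return True
    return False

baseline = {"cpu": 0.40, "mem": 0.50}   # measured while running alone
```

A production detector would smooth readings over a window to avoid flagging transient spikes, but the baseline-comparison core is the same.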