2018
DOI: 10.1007/s11432-017-9401-y

IO dependent SSD cache allocation for elastic Hadoop applications

Abstract: Elastic Hadoop applications consisting of multiple virtual machines (VMs) are widely used to support big data analysis and processing. In this scenario, flash-based solid state drive (SSD) is usually deployed on hypervisors and used as the cache to improve the IO performance. However, existing SSD caching schemes are mostly VM-centric, which focus on the low-level IO performance metrics of individual VMs. They may not lead to the optimized performance of elastic Hadoop applications, i.e., the job completion ti…

Citations: cited by 7 publications (7 citation statements)
References: 13 publications
“…For example, MapReduce based ecosystems such as Hadoop execute workloads on two stages, Map and Reduce. Each of them interacts with the local and the remote storage differently [8].…”
Section: Modeling I/O Pattern For Cost-effective Storage Controller
Citation type: mentioning
confidence: 99%
“…Previous research have shown that utilizing SSD drives can improve big data workloads' performance but improvement depending on the workload differently based on the workload type and SSD usage approach. In the Hadoop ecosystem, several studies investigated the performance improvement of a variety of MapReduce workloads when introducing SSD into the ecosystem as a tier [59], [60], [61] [12] or as a cache [8]. To examine the difference in SSD utilization usage approaches impact on MapReduce workloads, [62] and [61] compared the performance improvement when applying SSD as a tier and as a cache.…”
Section: Storage Capability and Workload Characteristic
Citation type: mentioning
confidence: 99%
“…In order to improve cache performance, numerous efforts on cache allocation are proposed for parallel computing [16], chip [7], virtual machines [17,18] and Web search engines [19]. Venkatesan et al [18] figure out the miss ratio curve of virtual machines are varied and propose a dynamic approach to partition cleancache fairly.…”
Section: Related Work and Motivation
Citation type: mentioning
confidence: 99%