2012 ACM/IEEE 13th International Conference on Grid Computing 2012
DOI: 10.1109/grid.2012.25

Data-Intensive Workload Consolidation for the Hadoop Distributed File System

Abstract: Workload consolidation, the sharing of physical resources among multiple workloads, is a promising technique to save cost and energy in cluster computing systems. This paper highlights a few challenges of workload consolidation for Hadoop, one of the current state-of-the-art data-intensive cluster computing systems. Through a systematic step-by-step procedure, we investigate challenges for efficient server consolidation in Hadoop environments. To this end, we first investigate the inter-relationship between…

Cited by 17 publications (21 citation statements)
References 15 publications
“…Since HCache utilizes a number of datanodes to enhance the parallel processing of the hot data, it is essential to validate the scalability of HCache. We test the Medium and Strong workload on HCache with 4, 8, 12, 16, 20, 24 datanodes to show the system's scalability. We measure how long it takes to finish these two workloads.…”
Section: Scalability
Citation type: mentioning
confidence: 99%
“…However, it is worth noting that a number of research reports [5] [6][7] [8] from industry or academia indicate that many applications of MapReduce exhibit strong skewed access patterns. In real production environment, each application accesses a small number of files or blocks from the cluster, the overall access pattern within the Hadoop is non-uniform, some files are accessed more frequently than others.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…A promising performance isolation must ensure a certain performance level, which is also predictable for service providers, among consolidated applications in such a way that operations of one VS do not affect the performance level of other services that concurrently running in the same physical machine 5,6 . Past studies demonstrated a lack of responsiveness in almost all VM management engines to adequately segregate the performance degradation, which is usually occurred due to the run‐time interference, among concurrently running VSs 7‐14 …”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To optimize the performance of Hadoop cluster, we need to tune large number of configuration parameters [34]. Existing work has reported the impact of change in block size [18] and trying different replication strategies [26] to enhance throughput of the system. Each of these parameters can take more than one value.…”
Section: Hadoop Cluster Experimental Setup Design and Configuration Parameters Tuning
Citation type: mentioning
confidence: 99%
“…There have been a number of efforts to make current big data technologies more efficient, scalable and robust. These include improving load balancing and process scheduling capabilities across different environments [13] [14] [15] [16] [17] [18], improving security of data [19] [20] [21], optimized utilization of available data storage [22] [23], making of scalable and applied cloud technologies and developing performance matrices for distributed file systems [24][25][26][27][28]. Distributed computing paradigms, such as peers to peers, clusters, grids, and the cloud platforms, have been used for distributed data mining [29,30].…”
Section: Introduction
Citation type: mentioning
confidence: 99%