2006
DOI: 10.1007/s10586-006-0011-6

Exploiting redundancy to boost performance in a RAID-10 style cluster-based file system

Abstract: While aggregating the throughput of existing disks on cluster nodes is a cost-effective approach to alleviating the I/O bottleneck in cluster computing, it suffers from potential performance degradation due to contention between storage data processing and user task computation for shared resources on the same node. This paper proposes to judiciously utilize the storage redundancy, in the form of the mirroring present in a RAID-10 style file system, to alleviate this performance degradation. More specific…
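The abstract's central idea is to steer storage reads toward the mirrored copy that sits on a less busy node, so that data serving stays out of the way of user task computation. The minimal sketch below illustrates that idea under assumed node names, a made-up load metric, and hypothetical function names; it is not the paper's actual scheduling algorithm.

```python
# Illustrative sketch (not the paper's implementation): for each block of a
# RAID-10 style file, read the replica whose host node currently reports the
# lighter compute load, so storage traffic avoids nodes busy with user tasks.

from dataclasses import dataclass

@dataclass
class Node:
    name: str
    compute_load: float  # hypothetical load metric in [0, 1]

@dataclass
class Block:
    index: int
    primary: Node  # node holding the primary copy
    mirror: Node   # node holding the mirrored copy

def schedule_reads(blocks):
    """Map each block to the replica on the less loaded node."""
    plan = {}
    for blk in blocks:
        replica = blk.primary if blk.primary.compute_load <= blk.mirror.compute_load else blk.mirror
        plan[blk.index] = replica.name
    return plan

if __name__ == "__main__":
    a, b = Node("node-a", 0.9), Node("node-b", 0.2)
    blocks = [Block(0, a, b), Block(1, b, a)]
    print(schedule_reads(blocks))  # {0: 'node-b', 1: 'node-b'}
```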

Cited by 4 publications (4 citation statements)
References 23 publications
“…A similar microbenchmark is also used to evaluate the read performance [65,66]. In addition, we propose to use the techniques of doubling the degree of parallelism and hot-spot skipping to improve the aggregate read performance.…”
Section: Read Performance and Real Application Benchmark
confidence: 99%
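As a rough illustration of the two techniques named in this statement, doubling the degree of parallelism and hot-spot skipping, the sketch below alternates stripe reads between primary and mirror copies so both contribute bandwidth, and redirects any stripe whose chosen node is a hot spot to the other copy. The node layout and hot-spot test are assumptions for illustration, not the cited system's logic.

```python
# Hypothetical sketch of two read optimizations over a striped, mirrored file:
# - doubling the degree of parallelism: alternate stripes between the primary
#   and mirror copies so both sets of disks serve the read;
# - hot-spot skipping: fall back to the other copy when the chosen node is hot.

def plan_parallel_read(stripes, hot_nodes):
    """Return (stripe_id, node) pairs for a mirrored, striped read.

    stripes:   list of (stripe_id, primary_node, mirror_node)
    hot_nodes: set of nodes currently identified as hot spots
    """
    plan = []
    for i, (sid, primary, mirror) in enumerate(stripes):
        # Double the parallelism: alternate between the two copies.
        first, second = (primary, mirror) if i % 2 == 0 else (mirror, primary)
        # Hot-spot skipping: use the other copy if the chosen node is hot.
        node = second if first in hot_nodes and second not in hot_nodes else first
        plan.append((sid, node))
    return plan

# Example: four stripes over nodes n0..n3 mirrored onto m0..m3, with n2 hot.
stripes = [(s, f"n{s}", f"m{s}") for s in range(4)]
print(plan_parallel_read(stripes, hot_nodes={"n2"}))
# [(0, 'n0'), (1, 'm1'), (2, 'm2'), (3, 'm3')]
```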
“…As data throughput is the most important objective of PVFS, some expensive but indispensable functions such as the concurrent control between data and metadata are not fully designed and implemented. In CEFT [6], [10], [13], [17], which is an extension of PVFS to incorporate a RAID-10-style fault tolerance and parallel I/O scheduling, the MS synchronizes concurrent updates, which can limit the overall throughput under the workload of intensive concurrent metadata updates. In Lustre [1], some low-level metadata management tasks are offloaded from the MS to object storage devices, and ongoing efforts are being made to decentralize metadata management to further improve the scalability.…”
Section: Related Work and Comparison of Decentralization Schemes
confidence: 99%
“…To divert the high volume of user data traffic to bypass any single centralized component, the functions of data and metadata managements are usually decomposed, and metadata is stored separately on different nodes away from user data. Although previous work on cluster-based storage mainly focuses on optimizing the scalability and efficiency of user data accesses by using a RAID-style striping [3], [10], caching [11], scheduling [12], [13], and networking [14], little attention has been drawn to the scalability of metadata management.…”
Section: Introduction
confidence: 99%
“…Each block (approximately 64 megabytes (MB)) is then stored in multiple different storage nodes to enhance concurrency and system performance [1]. Moreover, a number of other similar systems, such as RAID (Redundant Array of Independent Disks) systems [2] and geospatial information systems (GISs) [3], have been developed, all of which use declustering technologies for the distributed storage of large files.…”
Section: Introduction
confidence: 99%
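For readers unfamiliar with declustering, the toy sketch below shows the kind of placement this statement describes: a large file split into fixed-size blocks of roughly 64 MB, each block stored on several distinct storage nodes. The rotation rule and replica count are illustrative assumptions, not the placement policy of any particular system.

```python
# Rough illustration of declustered block placement: split a file into ~64 MB
# blocks and assign each block to several distinct nodes, rotating the starting
# node so load spreads across the cluster. Placement rule is an assumption.

BLOCK_SIZE = 64 * 1024 * 1024  # ~64 MB

def place_blocks(file_size, nodes, replicas=3):
    """Assign each block of the file to `replicas` distinct nodes."""
    num_blocks = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    placement = {}
    for b in range(num_blocks):
        placement[b] = [nodes[(b + r) % len(nodes)] for r in range(replicas)]
    return placement

print(place_blocks(200 * 1024 * 1024, ["n0", "n1", "n2", "n3"]))
# {0: ['n0', 'n1', 'n2'], 1: ['n1', 'n2', 'n3'],
#  2: ['n2', 'n3', 'n0'], 3: ['n3', 'n0', 'n1']}
```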