Proceedings 20th IEEE International Parallel & Distributed Processing Symposium 2006
DOI: 10.1109/ipdps.2006.1639564

Benefits of high speed interconnects to cluster file systems: a case study with Lustre


Cited by 7 publications (5 citation statements)
References 12 publications
“…The shared storage may be a dual-hosted hard drive, a networked storage using RAID, or a distributed replicated block device (DRBD) [154]. Such solutions have been extensively used in HPC environments for critical system services, such as the job and resource manager (e.g., SLURM [192] and Sun Grid Engine (SGE) [175]) and the parallel file system MDS (e.g., Parallel Virtual File System (PVFS) [146] and Lustre [194]). • Active/hot-standby redundancy using a commit protocol for state replication has been implemented for some HPC job and resource managers as part of high availability cluster solutions, such as HA-OSCAR [119] with its commit protocol for OpenPBS [16].…”
Section: Rationale (mentioning)
confidence: 99%
“…An implementation of HA-OSCAR supported high availability clustering for two job and resource managers, OpenPBS [16] and SGE [175]. Parallel file system MDSs, such as Lustre [194], support high availability clustering as well. • Active/standby redundancy also plays a role in resilience for parallel applications in HPC environments.…”
Section: Rationale (mentioning)
confidence: 99%
“…There have been many efforts in parallel and distributed data management systems to provide large I/O bandwidth [4,16,23]. However, metadata management is still a challenging problem in widely distributed large-scale storage systems.…”
Section: Introduction (mentioning)
confidence: 99%