2011
DOI: 10.1016/j.jpdc.2010.08.004

BlobSeer: Next-generation data management for large scale infrastructures

Abstract: As data volumes increase at a high speed in more and more application fields of science, engineering, information services, etc., the challenges posed by data-intensive computing gain an increasing importance. The emergence of highly scalable infrastructures, e.g. for cloud computing and for petascale computing and beyond introduces additional issues for which scalable data management becomes an immediate need. This paper brings several contributions. First, it proposes a set of principles for designing highly…

Cited by 106 publications (74 citation statements)
References 26 publications
“…Finally we analyze the data storage solution provided by BlobSeer [14]. This solution represents data as BLOBS taking into consideration that most data in circulation is unstructured.…”
Section: Data Storage and Aggregation Solution
Citation type: mentioning
confidence: 99%
“…The service is designed to respect all the requirements and constraints imposed by data-intensive applications and utilizes multiple features of BlobSeer [14] such as data striping, distributed metadata management and versioning-based concurrency control. The DDAS is designed to ensure scalability, fault tolerance and data retrieval performance [16].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…PVFS [23]) or cloud storage repositories (e.g. Amazon S3 [24]) to specialized storage systems [25] and even local storage. Local storage is particularly attractive, because it is much faster and more scalable compared to conventional approaches.…”
Section: B Architecture
Citation type: mentioning
confidence: 99%
“…Our approach is based on shadowing techniques [15], which means to offer the illusion of creating a new standalone snapshot of the file for each update to it but to physically store only the differences and manipulate metadata in such way that the aforementioned illusion is upheld. Starting from the principles introduced in [16], we propose to enable concurrent MPI processes to write their non-contiguous regions in complete isolation, without having to care about overlappings and synchronization, which is made possible by keeping data immutable: new differences are added rather than modify an existing snapshot. It is at the metadata level where the ordering is done and the overlappings are resolved in such way as to expose a snapshot of the file that looks as if all differences were applied in an arbitrary sequential order.…”
Section: Design Principles
Citation type: mentioning
confidence: 99%
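
The shadowing idea described in the statement above can be illustrated with a small sketch. It is not BlobSeer's implementation or the citing paper's: the names `Diff` and `VersionedBlob` are hypothetical, the example runs in a single process, and a snapshot is rebuilt by replaying differences, whereas a real system would resolve overlaps through its metadata. It only shows the principle that writes never modify stored data and that each version is a view over an ordered list of immutable differences.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Diff:
    """One immutable update: `data` written at byte `offset` (hypothetical type)."""
    offset: int
    data: bytes


class VersionedBlob:
    """Toy shadowing store: only differences are kept, never full copies."""

    def __init__(self, size: int) -> None:
        self._size = size
        # Append-only log; the position in the log is the serialization
        # order, so the diff at index i produces version i + 1.
        self._diffs: List[Diff] = []

    def write(self, offset: int, data: bytes) -> int:
        """Record a new difference and return the version it creates."""
        if offset < 0 or offset + len(data) > self._size:
            raise ValueError("write outside blob bounds")
        self._diffs.append(Diff(offset, bytes(data)))
        return len(self._diffs)

    def read(self, version: int) -> bytes:
        """Materialize the snapshot seen at `version` (0 is the empty blob)."""
        if not 0 <= version <= len(self._diffs):
            raise ValueError("unknown version")
        snapshot = bytearray(self._size)
        # Later differences overwrite earlier ones where they overlap.
        for diff in self._diffs[:version]:
            snapshot[diff.offset:diff.offset + len(diff.data)] = diff.data
        return bytes(snapshot)


blob = VersionedBlob(size=8)
v1 = blob.write(0, b"AAAA")   # version 1
v2 = blob.write(2, b"BBBB")   # version 2 overlaps version 1 at bytes 2..3
old_view = blob.read(v1)      # b"AAAA\x00\x00\x00\x00", unaffected by v2
new_view = blob.read(v2)      # b"AABBBB\x00\x00"
```

Because earlier differences are never touched, a reader holding version 1 keeps seeing the same bytes after version 2 is written, which is what lets writers proceed without coordinating on the data itself.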