2015 IEEE/ACIS 16th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distribu 2015
DOI: 10.1109/snpd.2015.7176191

Multicast-based replication for Hadoop HDFS

Abstract: The Hadoop HDFS is a popular open-source distributed storage system, which serves as the foundation of many important big-data technologies. The performance of data replication is crucial to HDFS, since it accounts for a major portion of network traffic in the entire cluster. In this research, we propose to enable multicast-based replication, which is expected to use less network bandwidth than the native TCP-based pipelined replication method. We developed a congestion-controlled reliable multicast socket (th…

Cited by 5 publications (2 citation statements)
References 17 publications
“…With the rapid advancement of the Apache Hadoop project, HDFS now serves a thriving ecosystem of open-source big data technologies, such as Giraph, Spark, Pig, Hive, and HBase. The performance of HDFS is crucial to all these components built on top of it (Wu & Hong, 2015).…”
Section: Hadoop Distributed File System Hdfs (mentioning)
confidence: 99%
“…The various features of the GFS also available in Hadoop as automatic failure management, flexible horizontal scalability, check sum correction and file redundancy (Wu and Hong, 2015). HDFS provides the highest level of fault tolerance over the low cost computer clusters.…”
Section: Hdfs (mentioning)
confidence: 99%