Proceedings of the 2016 International Conference on Artificial Intelligence and Engineering Applications 2016
DOI: 10.2991/aiea-16.2016.40

Exploiting FastDFS Client-based Small File Merging

Abstract: Storing large numbers of small files is a widely acknowledged problem in industry and academia. Small file merging is the most successful strategy for solving this problem and is supported by many distributed filesystems, such as FastDFS. However, our experiments indicate that, in FastDFS, excessively wide striping causes performance degradation, and pre-allocated space causes data loss or errors. Therefore, this paper presents a client-based small file merging scheme based on the characteristic…

Cited by 4 publications (3 citation statements)
References 6 publications (5 reference statements)

“…With the use of the divide-and-conquer strategy in distributed computing, a big data file is partitioned into a number of small files, called data blocks, which are stored in a distributed manner on the disks of cluster nodes to improve I/O performance. A big data file stored in this way is called a distributed data file, which is managed on the cluster with a distributed file system [32,33], e.g., GFS [8], HDFS [34], Taobao file system (TFS) [35,36], and FastDFS [37]. The distributed file systems provide an important technical foundation for big data analysis [38].…”
Section: Distributed File Systems (mentioning)
confidence: 99%
“…TFS is a high-availability, high-performance distributed file system developed by Taobao to meet the storage requirements of unstructured small files (usually no more than 1 MB). FastDFS is a lightweight open-source distributed file system that is especially suitable for online services using files as the carrier [37]. HDFS, which was developed in the Apache Hadoop project, was designed to overcome the challenges of distributed data processing in a large-scale cluster.…”
Section: Distributed File Systems (mentioning)
confidence: 99%
“…Finally, the consensus mechanism is used to realize the consensus of all nodes in the network, and legal blocks are joined in the blockchain so that the information on transactions cannot be tampered with. [25] is an open-source lightweight distributed file system developed by Using C language, which can work well on UNIX-like systems and pursue high performance and high scalability.…”
Section: Related Work and Background (mentioning)
confidence: 99%