MapReduce is a remarkable parallel programming model as well as a parallel processing infrastructure for large-scale data processing. Since it is now widely available on cloud environments, developing methodology or patterns for MapReduce programming is important. In particular, XML is the de facto standard for representing data, and processing semi-structured data is involved in many applications. The target computational patterns in this paper are tree accumulations. Tree accumulations are shape-preserving computations over a tree in which values are updated through flows over the tree. We develop BSP algorithms for two tree accumulations as extensions of the BSP algorithm for tree reduction by Kakehi et al. (Tech. Rep. METR 2006-64, ). We also implemented the two-superstep algorithms with a single MapReduce execution. Experimental results on a 16-node PC cluster show good speedups of a factor of 10.9-12.7.
MapReduce is a framework for large-scale data processing proposed by Google, and its open-source implementation, Hadoop MapReduce, is now widely used. Several language systems have been proposed to make developing MapReduce programs easier, for instance, Sawzall, FlumeJava, Pig, Hive, and Crunch. These language systems mainly target applications that can be naturally solved by using a MapReduce-like programming model. In this study, we propose a new MapReduce-program generator that accepts programs manipulating one-dimensional arrays. By using the proposed generator, users only need to write sequential programs to generate Hadoop MapReduce programs automatically. We applied some program optimization techniques to the generation of Hadoop MapReduce programs. In this paper, we also report our experiment results that compare programs generated by the proposed generator with hand-written MapReduce programs.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.