In this paper, we present a new algorithm for disk reconfiguration in the context of Vespa, a scalable platform developed by Yahoo! Technologies Norway for storing, retrieving, processing and searching large amounts of data. The corresponding scheduling problem is closely related to the scheduling of independent tasks on heterogeneous platforms, when communication costs are taken into account and each task can only be processed on a prescribed set of processors. We show how to derive, from a linear programming formulation in rational numbers, an approximation algorithm whose approximation ratio is close to 1 under Vespa's conditions of use. By performing an extensive set of simulations using SIMGRID, we also show that the proposed algorithm is in fact optimal under Vespa's conditions of use.
Corresponding Author: Olivier Beaumont, LaBRI, INRIA, Domaine Universitaire, 351 cours de la Libération, 33405 TALENCE Cedex, France. Tel: +33 5 40 00 37 98 Fax: +33 5 40 00 66 69
Abstract. This paper presents new techniques for master-slave tasking on tree-shaped networks with fully heterogeneous communication and processing resources. A large number of independent, equal-sized tasks are distributed from the master node to the slave nodes, which process them and return result files. The network links exhibit bandwidth asymmetry, i.e., the send and receive bandwidths of a link may differ. Each node can overlap computation with at most one send and one receive operation. A centralized algorithm that maximizes the platform throughput under static conditions is presented. We then propose several distributed heuristics that make scheduling decisions based on locally estimated information. Extensive simulations demonstrate that the distributed heuristics are better suited to dynamic environments, and that they also compete well with centralized heuristics in static environments.
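To make the notion of platform throughput concrete, the following is a minimal sketch of a steady-state throughput computation for a much simpler setting than the one in the abstract: a single-level star in which the master sends tasks through one port, result files are ignored, and the master does not compute. It is not the centralized algorithm of the paper. The function name `star_throughput` and the `(c, w)` cost pairs are hypothetical, introduced only for illustration. Under these assumptions, serving children in order of increasing communication cost maximizes the number of tasks processed per time unit.

```python
from fractions import Fraction


def star_throughput(children):
    """Steady-state throughput of a one-port master feeding a star of workers.

    Illustrative assumptions (not taken from the abstract):
      - children is a list of (c, w) pairs with c > 0 and w > 0,
        c = time for the master to send one task to that child,
        w = time for that child to process one task;
      - result files are ignored and the master does not compute.

    Returns (total tasks per time unit, list of per-child task rates).
    """
    # Fraction of each time unit during which the master's send port is still free.
    remaining = Fraction(1)
    rates = [Fraction(0)] * len(children)

    # Serve the cheapest-to-reach children first.
    order = sorted(range(len(children)), key=lambda i: children[i][0])
    for i in order:
        c = Fraction(children[i][0])
        w = Fraction(children[i][1])
        # A child is limited by its own speed and by the port time left over.
        rate = min(1 / w, remaining / c)
        rates[i] = rate
        remaining -= rate * c
        if remaining == 0:
            break

    return sum(rates), rates


if __name__ == "__main__":
    total, rates = star_throughput([(1, 4), (2, 3), (3, 2)])
    print(total, rates)
```

With the three hypothetical children above, the first two are saturated at their own processing speed and the third receives only the port time that is left, so a fast worker behind a slow link contributes little to the total.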