Contaminant Source Characterization (CSC) is a critical research issue in Water Distribution System (WDS) management. We use a simulation framework to identify optimized sensor locations that lead to fast detection of contamination sources [1,2]. The optimization engine is based on a Genetic Algorithm (GA) that interprets trial solutions as individuals. During the optimization process many thousands of these solutions are generated. For a large WDS, calculating these solutions is non-trivial and time consuming. Hence, CSC is a compute-intensive application that requires significant compute resources. Furthermore, we strive to generate solutions quickly, given the urgency of responding to a contamination event.

Grids and Clouds [3] can help in several ways. First, they can provide infrastructure of sufficient computational power. Second, they allow the introduction of fault-tolerant mechanisms, as ample resources can be made available. Third, the power of the available systems makes fast performance achievable. However, using Grids and Clouds requires "software stacks" that make it easier for the application developer to use the infrastructure they provide. We provide two distinct platforms.

The cyberaide platform: To carry out the calculations we require user-level middleware that supports the workflow [4,5] of the application and manages the resource assignment in an efficient and fault-tolerant fashion. To do so we have prototyped the cyberaide framework, which provides a convenient command line and portal layer for steering applications on Grids. Internally, we utilize a sophisticated workflow engine that provides access to elementary fault-tolerant mechanisms for job scheduling. This includes the management of job replicas and the reaction to the late return of results.

The Hadoop platform: We report the test results of CSC problem solving on a real Grid test bed, the TeraGrid test bed. In addition, we contrast this system architecture with a Hadoop-based implementation that automatically includes fault tolerance. The latter activity has been conducted on FutureGrid [6].

We find that the cyberaide platform provides better performance and also allows us to introduce custom-designed services more easily. Thus, if the user has access to Grids and is interested in performance, cyberaide is a good choice.

We find that Hadoop provides an easy-to-use programming framework that abstracts the infrastructure away from the application user. FutureGrid and TeraGrid were essential resources for this work. TeraGrid allowed us to explore the Grid infrastructure for this problem, while FutureGrid allowed us to consider the Hadoop platform.
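The replica-based fault tolerance mentioned for the cyberaide platform (job replicas plus a reaction to late results) can be illustrated with a minimal sketch. This is not the cyberaide implementation: the functions `evaluate` and `evaluate_with_replicas`, the use of a local thread pool in place of Grid resources, and the choice to report a late result as `None` are all assumptions made for illustration.

```python
import concurrent.futures as cf
import time


def evaluate(individual, delay=0.0):
    """Hypothetical stand-in for one simulation run of a GA individual.

    `delay` imitates a slow or overloaded resource; the returned sum is a
    placeholder fitness value, not a real CSC objective.
    """
    time.sleep(delay)
    return sum(individual)


def evaluate_with_replicas(pool, individual, delays, timeout):
    """Run one replica per entry in `delays` and keep the first result.

    Returns the fitness from whichever replica finishes first, or None if
    no replica returns within `timeout` seconds (the "late return" case,
    which the caller may handle by resubmitting).
    """
    futures = [pool.submit(evaluate, individual, d) for d in delays]
    done, not_done = cf.wait(
        futures, timeout=timeout, return_when=cf.FIRST_COMPLETED
    )
    for future in not_done:
        future.cancel()  # best effort: replicas already running keep going
    if not done:
        return None
    return next(iter(done)).result()


if __name__ == "__main__":
    with cf.ThreadPoolExecutor(max_workers=4) as pool:
        # One slow replica and one fast replica of the same individual.
        print(evaluate_with_replicas(pool, [1, 2, 3], [0.5, 0.0], timeout=2.0))
```

In this sketch the fast replica masks the slow one, so the GA receives a fitness value without waiting for the slowest resource. A production workflow engine would additionally have to track replicas across remote resources and decide when to resubmit.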
Abstract

We present a workflow-based algorithm for identifying threats to an urban water management system. Through Grid computing we provide the necessary high-performance computing resources to deliver solutions to the problem quickly. We prototyped a new middleware called cyberaide, that...