In the era of big data, mining and analysis of the enormous amount of data has been widely used to support decision-making. This complex process including huge-volume data collecting, storage, transmission, and analysis could be modeled as workflow. Meanwhile, cloud environment provides sufficient computing and storage resources for big data management and analytics. Due to the clouds providing the pay-as-you-go pricing scheme, executing a workflow in clouds should pay for the provisioned resources. Thus, cost-effective resource provisioning for workflow in clouds is still a critical challenge. Also, the responses of the complex data management process are usually required to be real-time. Therefore, deadline is the most crucial constraint for workflow execution. In order to address the challenge of cost-effective resource provisioning while meeting the real-time requirements of workflow execution, a resource provisioning strategy based on dynamic programming is proposed to achieve cost-effectiveness of workflow execution in clouds and a critical-path based workflow partition algorithm is presented to guarantee that the workflow can be completed before deadline. Our approach is evaluated by simulation experiments with real-time workflows of different sizes and different structures. The results demonstrate that our algorithm outperforms the existing classical algorithms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.