2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012)
DOI: 10.1109/ccgrid.2012.134

Workflow Scheduling to Minimize Data Movement Using Multi-constraint Graph Partitioning

Abstract: Among scheduling algorithms for scientific workflows, graph partitioning is a technique to minimize data transfer between nodes or clusters. However, when graph partitioning is applied naively to a complex workflow DAG, tasks in each parallel phase are not always evenly assigned to compute nodes, because the partitioning algorithm is not aware of the edge directions that represent task dependencies. Thus, we propose a new method of task assignment based on Multi-Constraint Graph Partitioning. This met…
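To make the idea in the abstract concrete, below is a minimal, self-contained Python sketch of multi-constraint partitioning applied to a workflow DAG: each task carries one balance constraint per parallel phase, and a greedy pass assigns tasks so that every phase stays evenly spread across compute nodes while the edge cut (data moved between nodes) is kept low. The toy DAG, the phase labels, and the greedy heuristic are illustrative assumptions for this sketch, not the algorithm or code from the paper.

```python
from collections import defaultdict

def multi_constraint_partition(tasks, edges, phases, nparts, slack=0):
    """Greedy multi-constraint partitioning sketch.

    tasks  : list of task ids, assumed to be in topological order
    edges  : dict (u, v) -> size of data transferred from u to v
    phases : dict task -> phase index (one balance constraint per phase)
    nparts : number of compute nodes
    slack  : extra tasks per phase a node may take beyond the even share
    Returns dict task -> part index.
    """
    nphases = max(phases.values()) + 1
    # load[p][k] = number of phase-k tasks currently on part p
    load = [[0] * nphases for _ in range(nparts)]
    per_phase = [0] * nphases
    for t in tasks:
        per_phase[phases[t]] += 1
    # per-part, per-phase limit: ceil(even share) plus optional slack
    limit = [-(-per_phase[k] // nparts) + slack for k in range(nphases)]

    neighbors = defaultdict(list)
    for (u, v), w in edges.items():
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    assignment = {}
    for t in tasks:
        k = phases[t]
        best, best_gain = None, None
        for p in range(nparts):
            if load[p][k] >= limit[k]:
                continue  # would violate the phase-k balance constraint
            # gain = bytes already on part p that t shares an edge with
            gain = sum(w for n, w in neighbors[t] if assignment.get(n) == p)
            if best_gain is None or gain > best_gain:
                best, best_gain = p, gain
        if best is None:  # every part at its limit: fall back to least loaded
            best = min(range(nparts), key=lambda p: load[p][k])
        assignment[t] = best
        load[best][k] += 1
    return assignment

if __name__ == "__main__":
    # Toy two-phase workflow: t0..t3 (phase 0) feed t4..t5 (phase 1).
    tasks = ["t0", "t1", "t2", "t3", "t4", "t5"]
    phases = {"t0": 0, "t1": 0, "t2": 0, "t3": 0, "t4": 1, "t5": 1}
    edges = {("t0", "t4"): 10, ("t1", "t4"): 10,
             ("t2", "t5"): 10, ("t3", "t5"): 10}
    print(multi_constraint_partition(tasks, edges, phases, nparts=2))
```

On the toy DAG this yields a zero-cut assignment with each phase split 2/2 and 1/1 across the two nodes, which is the behaviour the abstract contrasts with ordinary (single-constraint) graph partitioning that may crowd one phase onto a single node.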

Cited by 58 publications (45 citation statements)
References 17 publications (23 reference statements)
“…Single-objective optimization approaches are focused on minimizing the workflow makespan through scheduling [20,22,23] and task clustering approaches [1]. However, single-objective optimization is not sufficient for data intensive workflows.…”
Section: Related Work and Discussion
confidence: 99%
“…For example, Tanaka and Tatebe [20] proposed a data-aware scheduling strategy that reduces makespan by minimizing data movement between cluster nodes. However, their strategy can only operate with homogeneous resources during workflow execution and is not suitable for use in cloud environments.…”
Section: Related Work and Discussion
confidence: 99%
“…Consequently, this endeavor introduces a number of problems: (i) different users compete for resources within the cloud computing environment [15][16][17]; (ii) data needs to be transferred from one resource to another [18][19][20][21][22][23]; (iii) the inter-dependency between tasks introduces high communication cost and/or time [24][25][26][27][28][29][30]; and (iv) the trade-off between QoS of user requirements and the cost of workflow execution [1,10,13,[31][32][33][34]. However, this review targets execution cost-related problems which can possibly affect both service consumers and utility providers.…”
Section: Introduction
confidence: 99%
“…The Pwrake workflow system executes a workflow written in a Rakefile in parallel, assigning processes to the compute node where the input data is stored. Pwrake, moreover, allocates processes so as to minimize the data transfer size among compute nodes over the whole workflow execution by multi-constraint graph partitioning [34], which can reduce the transfer size of intermediate data generated during the workflow execution.…”
Section: Distributed Object Store for EBD Applications
confidence: 99%
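For the data-locality-aware placement described in the citation statement above, the following is a minimal Python sketch of the general idea only: each task is dispatched to the compute node that already stores the largest share of its input bytes, with a least-loaded fallback. The file-to-node map, task structure, and function names are hypothetical illustrations, not Pwrake's actual API or implementation.

```python
from collections import Counter

def place_by_locality(task_inputs, file_location, node_load):
    """Assign each task to the node holding most of its input bytes.

    task_inputs   : dict task -> {filename: size_in_bytes}
    file_location : dict filename -> node that stores the file
    node_load     : dict node -> number of tasks already assigned
    Returns dict task -> node.
    """
    placement = {}
    for task, inputs in task_inputs.items():
        local_bytes = Counter()
        for fname, size in inputs.items():
            local_bytes[file_location[fname]] += size
        if local_bytes:
            # node with the most local input data wins; ties go to the less loaded node
            node = max(local_bytes, key=lambda n: (local_bytes[n], -node_load[n]))
        else:
            # task with no known inputs: pick the least loaded node
            node = min(node_load, key=node_load.get)
        placement[task] = node
        node_load[node] += 1
    return placement

if __name__ == "__main__":
    files = {"a.fits": "node1", "b.fits": "node2", "c.fits": "node1"}
    tasks = {"warp_a": {"a.fits": 100}, "warp_b": {"b.fits": 100},
             "combine": {"a.fits": 100, "c.fits": 80}}
    print(place_by_locality(tasks, files, {"node1": 0, "node2": 0}))
```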