2017 46th International Conference on Parallel Processing (ICPP) 2017
DOI: 10.1109/icpp.2017.48

A Coflow-Based Co-Optimization Framework for High-Performance Data Analytics

Abstract: Efficient execution of distributed database operators such as joining and aggregating is critical for the performance of big data analytics. With the increase of the compute speedup of modern CPUs, reducing the network communication time of these operators in large systems is becoming increasingly important, and also challenging current techniques. Significant performance improvements have been achieved by using state-of-the-art methods, such as reducing network traffic designed in the data management…

Cited by 12 publications (6 citation statements)
References 29 publications (61 reference statements)
“…Fu et al implemented the Seagull system to optimize the average task completion time while also providing deadlines for tasks with guarantees. Cheng et al raised a joint optimization framework for high performance data analysis applications based on Coflow. Jiang et al monitored the network status based on the SDN technology and rerouted the bottleneck flow in the task to the lightly loaded link to reduce the task completion time.…”
Section: Related Work (mentioning)
confidence: 99%
“…Fu et al 23 implemented the Seagull system to optimize the average task completion time while also providing deadlines for tasks with guarantees. Cheng et al 24 mechanism to satisfy the deadline requirements of flows in multi-resource environments.…”
Section: Of 14 (mentioning)
confidence: 99%
“…POPI [8] is a lightweight algorithm targeting for efficient processing large outer joins under Hadoop. In order to improve the performance in large distributed environments, a new framework called CCF [9] is proposed to co-optimize application-level data movement and network-level data communications for distributed operators. Rayon [10] is proposed to reserve resources for production jobs and best-effort jobs such that the SLAs for production jobs can be guaranteed and meanwhile the execution time of best-effort jobs can be reduced.…”
Section: 22 (mentioning)
confidence: 99%
“…Consequently, designing effective algorithms to do array redistribution is crucial for implementing distributed memory system software for these programming languages. In addition to array redistribution in parallel systems, in recent years, different data redistribution issues also received a lot of attention, including data redistribution in wireless sensor networks to minimize power consumption [5], data redistribution and retrieval in wireless sensor networks to preserve data [6], data redistribution in parallel join operations in distributed system [7], data distribution without breaking its semantics [8], data redistribution of data between processes [9], co-optimized application-level data movement and network-level data communications for distributed operators [10].…”
Section: Introduction (mentioning)
confidence: 99%