Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015
DOI: 10.1145/2785956.2787505

Low Latency Geo-distributed Data Analytics

Abstract: Low latency analytics on geographically distributed datasets (across datacenters, edge clusters) is an upcoming and increasingly important challenge. The dominant approach of aggregating all the data to a single datacenter significantly inflates the timeliness of analytics. At the same time, running queries over geo-distributed inputs using the current intra-DC analytics frameworks also leads to high query response times because these frameworks cannot cope with the relatively low and variable capacity of WAN …

Cited by 194 publications (177 citation statements)
References 39 publications
“…For example, replicating the straggling task on another available node is a common approach to deal with stragglers (e.g., [4]), while partial data replication is also used to reduce the communication load in distributed computing (e.g., [5]). However, there have been recent results demonstrating that coding can play a transformational role for creating and exploiting computation redundancy to effectively alleviate the impact of system noise.…”
Section: Introduction (mentioning)
confidence: 99%
“…This class of techniques builds Virtual Machines (VMs) for users to use computing resources across geo-distributed datacenters as a single logical virtual cluster. These techniques primarily optimize the data placement [10] [12], the latency of the services [12] [13] [14], the Quality of Service (QoS) [11] [15], and the electricity cost [13] [16] across multiple datacenters.…”
Section: Related Work (mentioning)
confidence: 99%
“…As a result, the massive data sets expected to be involved in big data analytics scenarios can be geo-distributed across such global cloud platforms. Providing analytics over such data sets is thus fraught with efficiency and scalability limitations, as has been well recognized, for example, in the Iridium project [45].…”
Section: F Research Theme 5: Global-scale Geo-distributed Sea (mentioning)
confidence: 99%