2016 IEEE Symposium on Service-Oriented System Engineering (SOSE) 2016
DOI: 10.1109/sose.2016.73

Computing at Massive Scale: Scalability and Dependability Challenges

Abstract: Large-scale Cloud systems and big data analytics frameworks are now widely used for practical services and applications. However, with the increase of data volume, the heterogeneity of workloads and resources, together with the dynamicity of massive user requests, the uncertainties and complexity of resource scheduling and service provisioning increase dramatically, often resulting in poor resource utilization, vulnerable system dependability, and negative user-perceived performance impacts. In this paper, we…

Cited by 25 publications (16 citation statements)
References 51 publications
“…With big data processing demands soaring and service decoupling, there is a general manifestation that heterogeneous workloads (in terms of execution durations, resource patterns, etc.) run and operate in the data center cluster [39]. Herein, we make coarse comparisons shown in Table 1.…”
Section: Problem Definition (mentioning)
confidence: 99%
“…Specifically these data sets are too big to store on a single machine and so must be distributed. Already the growth of data is exponential [54] and increasing data collection and further cloud services will only accelerate this further [55]. Very quickly this could lead to a situation where we are no longer able to process the vast amount of data being collected.…”
Section: Challenge: Data Explosion (mentioning)
confidence: 99%
“…Faults may occur simultaneously and in any aspect of system operations ranging from application to hardware, and may have a wide variety of causes, including insufficient memory (OOM), overweight system utilization, performance interference, network congestion, server faults (e.g. disk, middleware software), and applications crash or hanging etc [8].…”
Section: Motivation (mentioning)
confidence: 99%
“…Dependability is a key concern for Cloud resource managers due to increasingly common failures which are now the norm rather than the exception caused by the enlarged system scale and complexity [6] [7] [8], different workload characteristics, and plethora of faults types that can activate. Failures within a resource manager have the potential to cause significant economic consequences to Cloud providers due to loss of service to consumers [9], and could affect services provisioned to millions globally in the event of catastrophic failures.…”
Section: Introduction (mentioning)
confidence: 99%