2020
DOI: 10.1007/s11227-020-03307-w
Predicting the performance of big data applications on the cloud

Abstract: Big data analytics have become widespread as a means to extract knowledge from large datasets. Such applications are often characterized by highly heterogeneous and irregular data access patterns, challenging existing software and hardware infrastructures to meet their dynamic resource demands. The cloud computing paradigm, in turn, offers a natural hosting solution to such applications as it provides flexibility and elasticity, adapting the allocated resources in response to the application's current needs. H…

Cited by 11 publications
(11 citation statements)
References 27 publications
“…Table 1 summarizes major similarities and differences between our work and the existing studies. First, the examined papers concern various application domains: load sharing facility (LSF) [8], parallel program [9], [11], [25], cloud [10], [29], HPC [2], [14], [15], [19], location-based services [20]-[23], databases [26]-[28], big data applications [29], [30], and scientific workloads [12], [13], [16]-[19]. The runtime estimation problem addressed in this paper applies to the scientific workloads domain.…”
Section: Related Work (mentioning)
confidence: 99%
“…Second, the target of estimation is slightly different across the studies we have investigated. Most of the existing works [2], [9]-[18], [29], [30] aim to predict the runtime. Some studies [2], [8] attempt to estimate the memory usage.…”
Section: Related Work (mentioning)
confidence: 99%
“…The many details of the Spark framework, its scheduler, and the other parameters of the cluster require advanced mathematical models to accurately predict the job completion time of a workload. The three main approaches in existing literature on the topic are simulation [10], [11], machine learning [12], [13], [14], [15], and analytical modeling [16], [17], [18]. Efforts to simulate a Spark cluster have resulted in a number of comprehensive simulators that mimic the actual framework's behavior [10].…”
Section: Introduction (mentioning)
confidence: 99%