2022
DOI: 10.26599/tst.2020.9010047

Improved heuristic job scheduling method to enhance throughput for big data analytics

Abstract: Data-parallel computing platforms, such as Hadoop and Spark, are deployed in computing clusters for big data analytics. There is a general tendency for multiple users to share the same computing cluster, so scheduling multiple jobs becomes a serious challenge. For a long time, the Shortest-Job-First (SJF) method has been considered the optimal solution for minimizing the average job completion time. However, the SJF method leads to low system throughput in the case where a small number of shor…
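A minimal sketch (not the paper's proposed method) of the classical result the abstract refers to: on a single server, running jobs in Shortest-Job-First order minimizes the average job completion time. The job durations below are hypothetical illustrative values.

```python
# Why Shortest-Job-First (SJF) minimizes average job completion time:
# short jobs finish early, so fewer jobs wait behind long ones.

def avg_completion_time(durations):
    """Average completion time when jobs run back-to-back in the given order."""
    elapsed, total = 0, 0
    for d in durations:
        elapsed += d          # this job finishes at the cumulative elapsed time
        total += elapsed
    return total / len(durations)

jobs = [8, 1, 3, 5]                       # arbitrary job lengths (time units)
fifo = avg_completion_time(jobs)          # run in arrival order
sjf = avg_completion_time(sorted(jobs))   # run shortest job first

# SJF never does worse than FIFO on average completion time
assert sjf <= fifo
```

With these durations, FIFO yields an average completion time of 11.5 while SJF yields 7.75; the paper's point is that optimizing this metric alone can still hurt overall throughput.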

Cited by 17 publications (10 citation statements)
References 17 publications
“…In addition, some predictive work should be done in combination with clinical practice to take full advantage of medical big data. Establishing the medical big data life cycle concerns the preparation of medical big data and the application of its results [13].…”
Section: Data Mining and Apriori Algorithm Based on Medical Big Data
confidence: 99%
“…This paper mainly analyzes the clustering interpolation method. The clustering algorithm classifies data samples according to their similarity and then divides them into categories [25]. The clustering algorithm model is shown in Figure 5.…”
Section: Optimize K-means Clustering Based on the Cuckoo
confidence: 99%
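The statement above describes plain similarity-based clustering. A minimal k-means sketch (not the cuckoo-optimized variant the cited paper develops) on 1-D samples, alternating between assigning each sample to its nearest centroid and moving each centroid to its cluster mean; the sample data and initial centroids are illustrative assumptions.

```python
# Minimal 1-D k-means: group samples by similarity to the nearest centroid.

def kmeans_1d(samples, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each sample joins the cluster of its nearest centroid.
        clusters = {i: [] for i in range(len(centroids))}
        for x in samples:
            nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in clusters.items()]
    return centroids, clusters

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]          # two visually obvious groups
centroids, clusters = kmeans_1d(data, [0.0, 5.0])
```

On this toy data the centroids converge to 1.0 and 9.0, splitting the samples into the two apparent groups; real variants (including metaheuristic ones like cuckoo search) mainly differ in how the initial centroids are chosen.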
“…It is a big data distributed computing paradigm in which resources are abstracted and virtualized, and these virtualized resources can be dynamically delivered to users on demand over the network. This makes cloud computing the preferred platform for integrating resources and providing services (Hu et al. 2022). It differs greatly from the traditional model, especially in scalability. Accordingly, cloud computing can be divided into three types by the services it provides: IaaS (Infrastructure as a Service), PaaS (Platform as a Service), and SaaS (Software as a Service).…”
Section: Intelligent Collection and Scheduling In
confidence: 99%