2023
DOI: 10.1186/s13677-023-00520-9

MapReduce scheduling algorithms in Hadoop: a systematic study

Soudabeh Hedayati,
Neda Maleki,
Tobias Olsson
et al.

Abstract: Hadoop is a framework for storing and processing huge volumes of data on clusters. It uses the Hadoop Distributed File System (HDFS) to store data and MapReduce, a parallel computing framework for processing large amounts of data on clusters, to process it. Scheduling is one of the most critical aspects of MapReduce because it can have a significant impact on the performance and efficiency of the overall system. The goal of scheduling is to improve pe…
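To make the map/reduce programming model the abstract refers to concrete, the sketch below is the canonical word-count job from the Hadoop MapReduce tutorial, using the standard org.apache.hadoop.mapreduce API. It only illustrates how map and reduce tasks are expressed and submitted as a job; it does not reproduce any of the scheduling algorithms surveyed in the paper.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in an input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Job setup: input/output paths come from the command line; the cluster's
  // scheduler decides where and when the map and reduce tasks actually run.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Which scheduler places the resulting map and reduce tasks is a cluster-level choice rather than part of the job code; in YARN-based Hadoop it is typically selected via the yarn.resourcemanager.scheduler.class property (e.g., the Capacity or Fair scheduler), which is precisely the design space this survey examines.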

Cited by 7 publications (3 citation statements) · References 88 publications
“…Concurrently, cloud computing has emerged to provide seamless access to extensive computing resources, networking, and storage capabilities, ensuring that applications can effectively handle large datasets. Owing to its wide range of applications and benefits, the MapReduce framework has been applied across various domains [64, 65].…”
Section: Methods
confidence: 99%
“…While this examination delves into potential avenues for improving Hadoop's performance and resource allocation strategies in the event of unforeseen failures, it also underscores the need for additional exploration into the scalability of these strategies and their potential influence on the overall system performance. A study has been conducted on data placement policies in Hadoop, presenting a modified approach to enhance system performance, especially in environments characterized by heterogeneity [22]. However, striking a harmonious balance between optimizing data placement and accommodating real-time data access patterns may present its own set of challenges.…”
Section: Related Work
confidence: 99%
“…MapReduce [5,6] is a distributed parallel computing technology. The cores of Hadoop are the MapReduce computing framework and the distributed file system HDFS [7].…”
Section: Introduction
confidence: 99%