2021
DOI: 10.1109/access.2021.3091675

Handling Non-Local Executions to Improve MapReduce Performance Using Ant Colony Optimization

Abstract: Improving the performance of the MapReduce scheduler is a primary objective, especially in a heterogeneous virtual cloud environment. A map task is assigned an input split (IS), which consists of one or more data blocks. When a map task is assigned more than one data block, non-local execution is performed. In classical MapReduce scheduling schemes, data blocks are copied over the network to the node where the map task is running. This increases job latency and consumes more network bandwidth within and…

Citations: Cited by 6 publications (5 citation statements)
References: 15 publications

“…Hence, the number of (map tasks, reduce tasks) for each job is (1000, 20), (500, 15), (2000, 50), (1500, 10), (600, 0), and (1200, 5), respectively. Similarly, the latency of (map, reduce) tasks are approximately (7,13), (6,11), (5,10), (7,15), (6, 0), and (13, 27), seconds respectively. These latencies are the observed units from the real execution we conducted and fixed the same for our simulation.…”
Section: Workload Related Parameters (mentioning)
confidence: 97%
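
The workload figures quoted above can be tabulated to make their scale easier to compare. The following is a minimal sketch, not code from the citing paper: it assumes the task counts and latencies are listed in the same job order, and the names `WORKLOAD` and `serial_task_seconds` are hypothetical. The sum it computes (tasks times per-task latency) is the total work if every task ran one after another, which ignores parallelism and is not the job latency the authors measured.

```python
# Values copied from the quoted citation statement.
# Assumption: counts and latencies are paired in the same job order.
# Each tuple: (map_tasks, reduce_tasks, map_latency_s, reduce_latency_s)
WORKLOAD = [
    (1000, 20, 7, 13),
    (500, 15, 6, 11),
    (2000, 50, 5, 10),
    (1500, 10, 7, 15),
    (600, 0, 6, 0),
    (1200, 5, 13, 27),
]


def serial_task_seconds(map_tasks, reduce_tasks, map_latency_s, reduce_latency_s):
    """Total task-seconds for one job if no tasks overlapped (illustrative only)."""
    return map_tasks * map_latency_s + reduce_tasks * reduce_latency_s


# One total per job, in the order the jobs are listed in the statement.
totals = [serial_task_seconds(*job) for job in WORKLOAD]

for index, total in enumerate(totals, start=1):
    print(f"job {index}: {total} task-seconds")
```

Dividing each total by the number of concurrently available task slots would give a rough lower bound on completion time, but the slot count is not stated in the quoted text.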
“…The authors focused on how the heterogeneous environment affects the performance in MapReduce job execution sequence. To improve data locality, minimize the number of non-local executions, and virtual network bandwidth consumption, ACO is used in [13], which splits and spreads the data block based on processing capacity VMs. A similar approach is used in [14] to improve the MapReduce scheduler performance in a heterogeneous environment.…”
Section: Literature Survey (mentioning)
confidence: 99%
“…Throughput: It is defined as the number of cloud tasks executed per unit time. The formula for calculating the Throughput is calculated as given in equation (10).…”
Section: 𝑇𝑆𝐸 = [ (mentioning)
confidence: 99%
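
The definition quoted above (tasks executed per unit time) can be written as a simple ratio. This is a sketch under that stated definition only: equation (10) of the citing paper is not reproduced in this report, so the exact form used there may differ, and the function name `throughput` and its parameters are hypothetical.

```python
def throughput(tasks_completed: int, elapsed_seconds: float) -> float:
    """Tasks executed per second, following the quoted definition.

    Assumption: a plain ratio; the citing paper's equation (10) is not shown here.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tasks_completed / elapsed_seconds


# Example: 120 tasks finished over a 60-second window.
rate = throughput(120, 60.0)
print(f"{rate} tasks per second")
```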
“…Improving Data Locality using the Ant Colony Optimization (IDLACO) algorithm was designed in [10] to decrease the number of non-local executions and bandwidth consumption. The designed algorithm was not efficient to achieve higher efficiency.…”
Section: Introduction (mentioning)
confidence: 99%
“…In recent years, batch processing in large commercial server clusters has made great progress, and batch-oriented big data analysis engines represented by MapReduce [1][2] and Spark [3] have established a programming model for batch processing of large data sets. With the rise of technologies such as the Internet of Things and edge computing, the stream computing model is becoming more and more popular, and a large number of applications in stream computing are processed in real time by pushing massive amounts of data generated in the external environment to the server [4].…”
Section: Introduction (mentioning)
confidence: 99%