Optimizing Data Processing: A Comparative Study of Big Data Platforms in Edge, Fog, and Cloud Layers
Thanda Shwe, Masayoshi Aritsugi
Abstract: Intelligent applications in several areas increasingly rely on big data solutions to improve their efficiency, but the processing and management of big data incur high costs. Although cloud-computing-based big data management and processing offer a promising solution to provide scalable and abundant resources, the current cloud-based big data management platforms do not properly address the high latency, privacy, and bandwidth consumption challenges that arise when sending large volumes of user data to the clo…
“…Shwe et al [21] analyzed the efficacy of SBC-based clusters in three application scenarios. This work compares big data processing platforms across three computing paradigms-batch, stream, and function processing-in resource-constrained environments such as edge and fog computing, versus traditional cloud deployments.…”
Section: SBC in Cloud Edge Clusters (mentioning)
confidence: 99%
“…The above-mentioned works highlight the proposition of deploying Hadoop clusters in edge environments with SBCs like Raspberry Pi as a viable option [19][20][21][23][24][25][26][27], driven by cost-effectiveness, energy efficiency, sustainability, and flexibility. Although it may require addressing certain challenges, the benefits in terms of reduced latency, scalability, and sustainability make it a compelling choice for many edge and remote scenarios.…”
Efficient resource allocation is crucial in clusters with frugal Single-Board Computers (SBCs) possessing limited computational resources. These clusters are increasingly being deployed in edge computing environments in resource-constrained settings where energy efficiency and cost-effectiveness are paramount. A major challenge in Hadoop scheduling is load balancing, as frugal nodes within the cluster can become overwhelmed, resulting in degraded performance and frequent occurrences of out-of-memory errors, ultimately leading to job failures. In this study, we introduce an Adaptive Multi-criteria Selection for Efficient Resource Allocation (AMS-ERA) in Frugal Heterogeneous Hadoop Clusters. Our criterion considers CPU, memory, and disk requirements for jobs and aligns the requirements with available resources in the cluster for optimal resource allocation. To validate our approach, we deploy a heterogeneous SBC-based cluster consisting of 11 SBC nodes and conduct several experiments to evaluate the performance using Hadoop wordcount and terasort benchmark for various workload settings. The results are compared to the Hadoop-Fair, FOG, and IDaPS scheduling strategies. Our results demonstrate a significant improvement in performance with the proposed AMS-ERA, reducing execution time by 27.2%, 17.4%, and 7.6%, respectively, using terasort and wordcount benchmarks.
“…al. [22] analyzed the efficacy of SBC based clusters in three application scenarios. This work compares big data processing platforms across three computing paradigms-batch, stream, and function processing-in resource-constrained environments such as edge and fog computing, versus traditional cloud deployments.…”
Section: SBC in Cloud Edge Clusters (mentioning)
confidence: 99%
“…The above-mentioned works highlight the proposition of deploying Hadoop clusters in edge environments with SBCs like Raspberry Pi is a viable option [20][21][22][23][24][25][26][27][28], driven by cost-effectiveness, energy efficiency, sustainability and flexibility. Although it may require addressing certain challenges, the benefits in terms of reduced latency, scalability, and sustainability make it a compelling choice for many edge and remote scenarios.…”
Efficient resource allocation is crucial in clusters with frugal Single-Board Computers (SBCs) possessing limited computational resources. These clusters are increasingly being deployed in edge computing environments in resource-constrained settings where energy efficiency and cost-effectiveness are paramount. A major challenge in Hadoop YARN scheduling is load-balancing, as frugal nodes within the cluster can become overwhelmed, resulting in degraded performance and frequent occurrences of out-of-memory errors, ultimately leading to job failures. In this study, we introduce an Adaptive Multi-criteria Selection for Efficient Resource Allocation (AMS-ERA) in Frugal Heterogeneous Hadoop Clusters. Our criterion considers CPU, memory and disk requirements for jobs and aligns the requirements with available resources in the cluster for optimal resource allocation. To validate our approach, we deploy a heterogeneous SBC-based cluster consisting of 11 SBC nodes and conduct several experiments to evaluate the performance using Hadoop wordcount and terasort benchmark for various workload settings. The results are compared to the Hadoop-Fair, FOG and IDaPS scheduling strategies. Our results demonstrate a significant improvement in performance with the proposed AMS-ERA, reducing execution time by 27.2%, 17.4% and 7.6% respectively using terasort and wordcount benchmarks.