2022
DOI: 10.4314/dujopas.v8i3b.10

Evaluation of Open-Source Tools for Big Data Processing

Abstract: Every day, terabytes of data are generated, mostly by modern information systems, new technologies, the Internet of Things (IoT), and cloud computing. With an ever-expanding number of alternatives, choosing machine learning tools for big data to analyse such massive volumes can be difficult, and effort is required at various stages to extract the information needed for decision making. As big data analysis is currently the latest researchable area of interes…

Citations: cited by 1 publication (2 citation statements)
References: 3 publications (4 reference statements)
“…It offers a fast in-memory data processing engine that can handle a wide variety of workloads from batch processing to real-time streaming, machine learning, and graph processing [16,20,22]. Spark is based on a resilient distributed dataset (RDD) abstraction model which is an immutable collection of records partitioned across several nodes [19,23,24].…”
Section: Apache Spark
Citation type: mentioning
confidence: 99%
“…Aside from being very fast and versatile, another big factor that influences Spark's popularity is its support for some of the most popular programming languages in the world, Python, Java, and Scala, by offering corresponding APIs [24,30,31]. Additionally, it offers SQL and DataFrame APIs.…”
Section: Benefits Of Apache Spark
Citation type: mentioning
confidence: 99%
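
The first citation statement describes an RDD as an immutable collection of records partitioned across several nodes. A minimal pure-Python sketch of that idea follows. It is a toy model only, not Spark's API: the `ToyRDD` class, its `from_records`, `map`, and `collect` methods, and the round-robin partitioning are all hypothetical names and choices made for illustration, and the "nodes" are simply tuples held in one process.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class ToyRDD(Generic[T]):
    """Hypothetical stand-in for an RDD: immutable and split into partitions."""

    # Each inner tuple plays the role of the records held by one node.
    partitions: Tuple[Tuple[T, ...], ...]

    @staticmethod
    def from_records(records: Iterable[T], num_partitions: int) -> "ToyRDD[T]":
        items = list(records)
        # Round-robin assignment is an assumption made for this sketch;
        # a real engine chooses its own partitioning strategy.
        parts = tuple(tuple(items[i::num_partitions]) for i in range(num_partitions))
        return ToyRDD(parts)

    def map(self, fn: Callable[[T], U]) -> "ToyRDD[U]":
        # A transformation never mutates the source; it builds a new
        # collection partition by partition.
        return ToyRDD(tuple(tuple(fn(x) for x in part) for part in self.partitions))

    def collect(self) -> List[T]:
        # Gather the records of every partition into one local list.
        return [x for part in self.partitions for x in part]


numbers = ToyRDD.from_records(range(6), num_partitions=3)
scaled = numbers.map(lambda x: x * 10)
```

Because the dataclass is frozen and the partitions are tuples, `numbers` is unchanged after `map` runs; each transformation yields a new collection, which is the property the quoted statement emphasises.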