2020
DOI: 10.1109/tkde.2020.2975652

A Survey on Spark Ecosystem: Big Data Processing Infrastructure, Machine Learning, and Applications

Citations: cited by 35 publications (21 citation statements)
References: 72 publications
“…Another challenge is that memory becomes a bottleneck when dealing with a large amount of data. Keeping the data in memory is very expensive, and additional memory resources are required when a massive amount of data must be processed in computation [165].…”
Section: H Distributed Processing Of Ddos Attacks Using Hadoop and S
Citation type: mentioning
confidence: 99%
“…Spark is one of the most commonly used big data computing platforms, it is a parallel computing framework for big data based on memory computing, which can be used to build faster and more efficient big data analysis applications [29]. Fig.…”
Section: Overview Of Spark Computing Framework
Citation type: mentioning
confidence: 99%
“…Furthermore, each RDD is divided into several partitions that can be calculated on the various nodes of the cluster, allowing for data distribution and parallelization. In Spark, all functions are carried out on RDDs [24].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…In fact, the big satellite image data is incorporated in strips into the RDDs [24]. Additionally, using HDFS the couple key/value can be efficiently generated to the image partitions in order to conceal the remote sensing data heterogeneity.…”
Section: Related Work
Citation type: mentioning
confidence: 99%