2015 Seventh International Symposium on Parallel Architectures, Algorithms and Programming (PAAP) 2015
DOI: 10.1109/paap.2015.41

Spark: A Big Data Processing Platform Based on Memory Computing

Citations: cited by 45 publications (23 citation statements)
References: 13 publications
“…• Programmable Clusters condition has brought a few difficulties: Firstly, numerous applications should be modified in a parallel way, and the programmable Clusters need to process more sorts of information figuring; Secondly, the adaptation to internal failure of the Clusters is progressively significant and troublesome; Thirdly, Clusters powerfully arrange the registering assets between shared clients, which builds the obstruction of the applications. With the quick increment of utilizations, Clusters figuring requires a working answer for suit various computations [18]. • Common difficulties during information change and highlight extraction include: Taking absolute information, (for example, nation for geolocation or classification for a motion picture) and encoding it in a numerical portrayal.…”
Section: A Challenges In Existing Methodologies (mentioning)
confidence: 99%
“…Saving input-output and middle data in In-memory as a form of RDD(Resilient Distributed Dataset) facilitates more rapid processing speed because it could show high performance and rapid processing of conversational work road without additional cost or repetition of I/O. [2] Above figure shows a structure of Stack. There are standalone, Scheduler, YARN and Mesos for operating Spark in infraclass.…”
Section: A Apache Flume (mentioning)
confidence: 99%
“…Spark is a new generation of distributed processing framework for big data following Hadoop. It has been rapidly pursued by academia and industry with its advanced design concept.…”
Section: Introduction (mentioning)
confidence: 99%
“…It has been rapidly pursued by academia and industry with its advanced design concept. It not only efficiently processes a large amount of data from different applications and data sources but also greatly reduces the number of disk I/Os by caching intermediate data of applications in memory and using a more powerful and flexible task scheduling mechanism based on directed acyclic graph (DAG). Because Spark implements the DAG execution engine, which can efficiently process data streams based on memory, it is 100 times faster in terms of memory‐based operations and 10 times faster in hard disk‐based operations than Hadoop MapReduce according to the official test results…”
Section: Introduction (mentioning)
confidence: 99%
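
The citation statements above describe Spark as cutting repeated I/O by caching intermediate data in memory and by scheduling work over a directed acyclic graph of transformations. The sketch below is a toy, single-process illustration of that idea in plain Python. It is not Spark's API: the `LazyDataset` class, its methods, and the `compute_count` counter are all hypothetical names made up for this example.

```python
# Toy illustration only; this is NOT Spark's API.
# LazyDataset and every name below are hypothetical.

class LazyDataset:
    """A dataset defined by a parent plus a transformation (a tiny lineage graph)."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source          # raw input, only set on the root node
        self._parent = parent          # upstream node in the lineage graph
        self._fn = fn                  # transformation applied to the parent's rows
        self._cache_enabled = False
        self._cached = None
        self.compute_count = 0         # how many times this node was evaluated

    def map(self, fn):
        # Lazy: records the transformation, computes nothing yet.
        return LazyDataset(parent=self, fn=lambda rows: [fn(r) for r in rows])

    def filter(self, pred):
        return LazyDataset(parent=self, fn=lambda rows: [r for r in rows if pred(r)])

    def cache(self):
        # Ask this node to keep its result in memory after the first evaluation.
        self._cache_enabled = True
        return self

    def collect(self):
        # An action: walks the lineage upward and evaluates what is needed.
        if self._cached is not None:
            return self._cached
        self.compute_count += 1
        if self._parent is None:
            rows = list(self._source)
        else:
            rows = self._fn(self._parent.collect())
        if self._cache_enabled:
            self._cached = rows
        return rows


base = LazyDataset(source=range(10))
squares = base.map(lambda x: x * x).cache()   # shared intermediate result
evens = squares.filter(lambda x: x % 2 == 0)
large = squares.filter(lambda x: x > 20)

evens_result = evens.collect()
large_result = large.collect()

# The cached intermediate node is evaluated once, not once per downstream action.
print(squares.compute_count)
```

Without the `cache()` call, both `collect()` calls would re-evaluate `squares` and re-read `base`; with it, the second action reuses the in-memory result. Real Spark adds partitioning, distribution, and fault recovery on top of this basic pattern, none of which is modelled here.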