2019
DOI: 10.1186/s13174-019-0118-7
Upgrading a high performance computing environment for massive data processing

Abstract: High-performance computing (HPC) and massive data processing (Big Data) are two trends that are beginning to converge. In that process, aspects of hardware architectures, systems support and programming paradigms are being revisited from both perspectives. This paper presents our experience on this path of convergence with the proposal of a framework that addresses some of the programming issues derived from such integration. Our contribution is the development of an integrated environment that integrates (i) …

Cited by 7 publications (6 citation statements) | References 23 publications (33 reference statements)

Citation statements:
“…Observations show that ants leave a secretion during their movement, and ants behind them can make a biased path choice based on the secretion left by the ants in front. This constitutes a positive feedback mechanism for learning information, and ants seek the shortest path to food through this information exchange [18,19].…”
Section: Basic Principles (mentioning)
confidence: 99%
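The positive feedback loop this statement describes is the core of ant colony optimization: pheromone on edges of shorter tours is reinforced faster than evaporation removes it, which biases later ants toward those edges. Below is a minimal, self-contained sketch of that mechanism; the parameters (alpha, beta, rho, Q) and their values are conventional ACO choices, not details taken from the cited work.

```python
# Minimal sketch of pheromone-based positive feedback (ACO-style).
# Parameter values are illustrative assumptions.
import random

def choose_next(current, unvisited, pheromone, dist, alpha=1.0, beta=2.0):
    """Pick the next node with probability proportional to
    pheromone^alpha * (1/distance)^beta -- the biased path choice."""
    weights = [
        (pheromone[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
        for j in unvisited
    ]
    return random.choices(unvisited, weights=weights, k=1)[0]

def update_pheromone(pheromone, tours, dist, rho=0.5, Q=100.0):
    """Evaporate all trails, then deposit pheromone inversely proportional
    to tour length: shorter tours reinforce their edges more strongly,
    which is the positive feedback described in the quoted passage."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - rho)
    for tour in tours:
        length = sum(dist[tour[k]][tour[k + 1]] for k in range(len(tour) - 1))
        for k in range(len(tour) - 1):
            i, j = tour[k], tour[k + 1]
            pheromone[i][j] += Q / length
            pheromone[j][i] += Q / length
```

Each iteration alternates the two functions: ants build tours edge by edge with choose_next, then update_pheromone shifts the probability mass toward the shortest tours found so far.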
“…The load balancing methods [10] are used to distribute tasks and loads between nodes in the system. These methods help to improve resource utilization, shorten response times, and enable flexible scalability, which is an essential part of cloud computing [11,12].…”
Section: Related Work and Motivations (mentioning)
confidence: 99%
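As a concrete illustration of the task distribution this statement refers to, the sketch below implements one simple policy, greedy least-loaded-first assignment. It is an assumed example chosen for brevity; the methods surveyed in [10] cover many other strategies.

```python
# Hedged sketch of one load-balancing policy: assign each task to the
# currently least-loaded node. Task costs and node count are assumptions.
import heapq

def balance(tasks, num_nodes):
    """Assign each (task_id, cost) pair to the least-loaded node so far."""
    heap = [(0.0, node) for node in range(num_nodes)]  # (load, node)
    heapq.heapify(heap)
    assignment = {node: [] for node in range(num_nodes)}
    for task_id, cost in tasks:
        load, node = heapq.heappop(heap)   # node with the smallest load
        assignment[node].append(task_id)
        heapq.heappush(heap, (load + cost, node))
    final_loads = {node: load for load, node in heap}
    return assignment, final_loads

# Ten tasks of uneven cost spread over three nodes.
tasks = [("t%d" % i, float(1 + i % 4)) for i in range(10)]
assignment, loads = balance(tasks, num_nodes=3)
print(assignment, loads)
```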
“…In that way, different resources will be used to execute, in parallel, the different tasks. COMPSs is able to transparently transfer files that are used as input parameters for a task; it also supports shared disks or HDFS (by using a connector [27]) to speed up the read step. As an example, Fig.…”
Section: The COMPSs Framework (mentioning)
confidence: 99%
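To make the task and file-transfer model concrete, here is a minimal PyCOMPSs-style sketch. The @task decorator, the FILE_IN direction, and compss_wait_on follow the publicly documented PyCOMPSs API, but the word_count function and the part-*.txt file names are hypothetical, and the snippet only executes inside a COMPSs runtime.

```python
# Minimal PyCOMPSs-style sketch; word_count and the file names are
# hypothetical placeholders. Requires a COMPSs installation to run.
from pycompss.api.task import task
from pycompss.api.parameter import FILE_IN
from pycompss.api.api import compss_wait_on

@task(path=FILE_IN, returns=int)
def word_count(path):
    # The runtime transfers the input file (or resolves it on a shared
    # disk/HDFS) to whichever node executes this task.
    with open(path) as fh:
        return sum(len(line.split()) for line in fh)

# Independent tasks on different files can run in parallel on different
# resources; the runtime moves each file to its task transparently.
futures = [word_count("part-%d.txt" % i) for i in range(4)]
counts = compss_wait_on(futures)  # synchronize and fetch the results
print(sum(counts))
```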
“…Recent frameworks, such as COMPSs and Spark, implement different schedulers to explore data locality and other policies transparently to users. In addition, distributed storage systems (e.g., HDFS, Cassandra, Hive, among others) that are supported in many frameworks like Spark (natively) or COMPSs (through an API [27]) can help the schedulers by increasing the possibilities of improving data locality when reading files. Frameworks that use a conventional file system typically adopt as a rule that input files will be located on the master computer and will be transferred over the network when required by a task.…”
Section: Exploring Data Locality (mentioning)
confidence: 99%
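A toy locality-aware placement routine, sketched below under assumed data structures, makes the contrast explicit: a task whose input block has a replica on an idle node runs there without a network read, while the rest fall back to fetching the block remotely, as a conventional master-resident file system would force for every task.

```python
# Toy locality-aware scheduler in the spirit described above; the data
# structures (block_locations, free_nodes) are assumptions for
# illustration, and it assumes at least as many idle nodes as tasks.
def schedule(tasks, block_locations, free_nodes):
    """tasks: list of (task_id, input_block) pairs.
    block_locations: maps a block to the set of nodes holding a replica.
    free_nodes: mutable set of currently idle nodes."""
    placement = {}
    for task_id, block in tasks:
        local = block_locations.get(block, set()) & free_nodes
        if local:
            node = next(iter(local))       # data-local: read from local disk
        else:
            node = next(iter(free_nodes))  # remote: fetched over the network
        free_nodes.discard(node)
        placement[task_id] = node
    return placement

# Example: two of three tasks can run where their block replica lives.
blocks = {"b1": {"n1"}, "b2": {"n2"}, "b3": {"n1"}}
print(schedule([("t1", "b1"), ("t2", "b2"), ("t3", "b3")],
               blocks, free_nodes={"n1", "n2", "n3"}))
```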