2011
DOI: 10.1016/j.jpdc.2010.08.013
Transparent runtime parallelization of the R scripting language

Cited by 11 publications (7 citation statements); references 26 publications.
“…That could explain why SCBI MapReduce skeleton shows a speed-up of 31-fold for 32 cores and 59-fold for 64 cores, even with sequence data (Figure 2(a)). This performance is better than the one displayed by the R package pR, where 32 cores provide speedups of 20-27-fold, depending on the process [25]. Several design reasons can also be invoked to explain such an efficiency [34]: (i) disk I/O operations are reduced to minimum (data are read only at the beginning and results are saved only at the end); (ii) absence of asymmetry impact (Figure 1(b)); (iii) the manager overhead is limited when using more than 2 cores and chunks of sequences (Tables 2 and 3); and (iv) longer tasks increased the efficiency because the manager is on standby most of the time, while waiting for the workers to finish, avoiding relaunching of internal or external programs for brief executions.…”
Section: SCBI MapReduce Is an Efficient Task-Farm Skeleton
Confidence: 79%
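The task-farm design points quoted above (single read at the start, single write at the end, workers processing chunks while the manager stands by) can be sketched generically. The code below is an illustrative stand-in, not the actual implementation of SCBI MapReduce or pR; the function and parameter names are hypothetical, and Python's `multiprocessing.Pool` is used only to mimic the manager/worker pattern described.

```python
# Hypothetical sketch of a task-farm skeleton, mirroring the design points
# quoted above: input is split once, workers each process a chunk of work,
# and results are gathered once at the end. Not the real SCBI MapReduce or
# pR code; all names here are illustrative.
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for an expensive per-item computation (e.g. per-sequence work).
    return [x * x for x in chunk]

def task_farm(data, n_workers=4, chunk_size=8):
    # (i) "read" the input once and split it into chunks up front.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # (iii)/(iv) the manager merely dispatches chunks and waits; the workers
    # do the long-running computation in parallel.
    with Pool(n_workers) as pool:
        results = pool.map(process_chunk, chunks)
    # (i) flatten and "write" the results once at the end.
    return [y for chunk in results for y in chunk]

if __name__ == "__main__":
    out = task_farm(list(range(32)))
    print(out[:5])  # → [0, 1, 4, 9, 16]
```

Chunking keeps manager/worker communication coarse-grained, which is the property the quoted excerpt credits for limiting manager overhead.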
“…Parallelisation libraries for the R language, besides Rmpi, include the SPRINT [24] and pR [25] packages, whose main advantage is that they require very little modification to existing sequential R scripts and no expertise in parallel computing; however, the master/worker scheme suffers from communication overhead, and the authors recognise that their approach may not yield the optimal schedule [25]. Other parallelisation libraries are snow and nws, which provide coordination and parallel execution facilities.…”
Section: Related Work
Confidence: 99%
“…Neither approach involves compiler manipulation, thus differing from our approach. The recent work of Li et al. [16] on the scripting array language R also parallelizes its run-time routines, but it does use sophisticated compiler technology to do so.…”
Section: Discussion
Confidence: 99%
“…Targeting graph mining, PEGASUS [15] implements generalized iterated matrix-vector multiply efficiently on Hadoop. RIOT [24] and RevoScaleR focus on making statistical computing workloads in R I/O-efficient; pR [16] automatically parallelizes function calls and loops in R. Pig [17], Hive [20], and SciHadoop [4] are examples of higher-level languages and execution plan generators for MapReduce systems. Our work goes beyond these systems by addressing important usability issues of automatic hardware provisioning and configuration.…”
Section: Related Work
Confidence: 99%