2013 IEEE International Congress on Big Data
DOI: 10.1109/bigdata.congress.2013.67

RABID -- A General Distributed R Processing Framework Targeting Large Data-Set Problems

Cited by 9 publications (4 citation statements)
References 2 publications
“…Many emerging parallel R packages, such as RHIPE, SparkR [17], RABID [18], Snowfall, Rmpi and pbdMPI [19], can be used to parallelize R processes. RHIPE is a Hadoop MapReduce based R package that transforms R functions into MapReduce jobs.…”
Section: Methods (mentioning)
confidence: 99%
“…R demonstrates superiority in statistical computing, graphical plotting and data analysis compared with graphical user interface (GUI) software. Moreover, R excels in big data analysis [76][77][78], data mining [79] and visualisation [80] modelling, plotting and image processing, which are supported through the variety of its built-in packages.…”
Section: Related Work (mentioning)
confidence: 99%
“…A number of academic (Ricardo [13], RHIPE [17], RABID [19]) and commercial (RHadoop [5], BigR [33]) projects have looked at integrating R with Apache Hadoop. SparkR follows a similar approach but inherits the functionality [23] and performance [3] benefits of using Spark as the execution engine.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, data analysis using R is limited by the amount of memory available on a single machine and further as R is single threaded it is often impractical to use R on large datasets. Prior research has addressed some of these limitations through better I/O support [35], integration with Hadoop [13,19] and by designing distributed R runtimes [28] that can be integrated with DBMS engines [25].…”
Section: Introduction (mentioning)
confidence: 99%