Proceedings of the 4th Annual Symposium on Cloud Computing 2013
DOI: 10.1145/2523616.2525952

High performance clustering of social images in a map-collective programming model

Abstract: Large-scale iterative computations are common in many important data mining and machine learning algorithms needed in analytics and deep learning. In most of these applications, individual iterations can be specified as MapReduce computations, leading to the Iterative MapReduce programming model for efficient execution of data-intensive iterative computations interoperably between HPC and cloud environments. Further one needs additional communication patterns from those familiar in MapReduce and we base our in…
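As an illustration of the pattern the abstract describes (each iteration is a map step, followed by a collective communication step that combines the partial results), the following is a minimal sketch only. It assumes K-means on 2-D points as the clustering algorithm and simulates the collective in a single process. The names `map_partition`, `allreduce`, and `kmeans` are hypothetical and are not taken from the paper or from any framework's API.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Partial = Tuple[List[List[float]], List[int]]


def map_partition(points: List[Point], centroids: List[Point]) -> Partial:
    """Map step: per-partition coordinate sums and counts for each centroid."""
    k = len(centroids)
    sums = [[0.0, 0.0] for _ in range(k)]
    counts = [0] * k
    for x, y in points:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
        )
        sums[j][0] += x
        sums[j][1] += y
        counts[j] += 1
    return sums, counts


def allreduce(partials: List[Partial]) -> Partial:
    """Simulated collective: element-wise sum of every partition's partial result.

    In a distributed run, every worker would receive this combined result.
    """
    k = len(partials[0][1])
    sums = [[0.0, 0.0] for _ in range(k)]
    counts = [0] * k
    for p_sums, p_counts in partials:
        for j in range(k):
            sums[j][0] += p_sums[j][0]
            sums[j][1] += p_sums[j][1]
            counts[j] += p_counts[j]
    return sums, counts


def kmeans(
    partitions: List[List[Point]], centroids: List[Point], iterations: int = 10
) -> List[Point]:
    """Run a fixed number of map + collective iterations."""
    for _ in range(iterations):
        partials = [map_partition(p, centroids) for p in partitions]
        sums, counts = allreduce(partials)
        # Keep the old centroid when no point was assigned to it.
        centroids = [
            (sums[j][0] / counts[j], sums[j][1] / counts[j]) if counts[j] else centroids[j]
            for j in range(len(centroids))
        ]
    return centroids


if __name__ == "__main__":
    data = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

In this sketch the data stays in its partitions across iterations and only the small centroid model is exchanged, which is one plausible reading of why a collective step can be cheaper than a full shuffle-and-reduce per iteration.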

Cited by 4 publications (3 citation statements)
References 16 publications
“…Previous work shows many machine learning algorithms can be implemented in the MapReduce paradigm [7]; later on, model communication is improved by collective communication operations in iterative MapReduce [12,19,6]. …”
Section: Discussion (mentioning)
confidence: 99%
“…We categorize these into three [32]; for (B) an MPI-based K-Means implementation [51]. We examine the following hybrid approaches: (C.1) Python Scripting implementation using Pilots [8] (Pilot-KMeans), (C.2) a Spark K-Means [52] and (C.3) a HARP implementation [50]. HARP introduces an abstraction for collective operations within Hadoop jobs [50].…”
Section: High-performance Big Data Stack: a Convergence Of Paradigms? (mentioning)
confidence: 99%
“…We examine the following hybrid approaches: (C.1) Python Scripting implementation using Pilots [8] (Pilot-KMeans), (C.2) a Spark K-Means [52] and (C.3) a HARP implementation [50]. HARP introduces an abstraction for collective operations within Hadoop jobs [50]. While (C.1) provides an interoperable implementation of the MapReduce programming model for HPC environments, (C.2) and (C.3) enhance Hadoop for efficient iterative computations and introduce collective operations to Hadoop environments.…”
Section: High-performance Big Data Stack: a Convergence Of Paradigms? (mentioning)
confidence: 99%