2012 IEEE 12th International Conference on Data Mining 2012
DOI: 10.1109/icdm.2012.155

Parallelization with Multiplicative Algorithms for Big Data Mining

Abstract: We propose a nontrivial strategy to parallelize a series of data mining and machine learning problems, including 1-class and 2-class support vector machines, nonnegative least squares problems, and ℓ1 regularized regression (LASSO) problems. Our strategy fortunately leads to extremely simple multiplicative algorithms which can be straightforwardly implemented in parallel computational environments, such as MapReduce or CUDA. We provide rigorous analysis of the correctness and convergence of the algorithm. We d…
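
To make the idea of a multiplicative algorithm concrete, here is a minimal sketch for the nonnegative least squares case. It uses one well-known multiplicative rule, x_i ← x_i · (Aᵀb)_i / (AᵀA x)_i, which is valid under the assumption that A and b have nonnegative entries. This is an illustration only and is not necessarily the update rule derived in the paper; the function name `nnls_multiplicative` and its parameters are hypothetical.

```python
def nnls_multiplicative(A, b, iters=500, eps=1e-12):
    """Approximately minimise ||A x - b||^2 subject to x >= 0.

    Assumption (not from the paper): A and b contain only nonnegative
    entries, so the ratio used in the update is itself nonnegative.
    A is a list of rows, b is a list of floats.
    """
    m, n = len(A), len(A[0])
    # Precompute Q = A^T A and c = A^T b once.
    Q = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    c = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]

    x = [1.0] * n  # strictly positive starting point
    for _ in range(iters):
        Qx = [sum(Q[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Each coordinate is rescaled using only its own ratio, so the
        # n updates are independent and could run in parallel.
        x = [x[i] * c[i] / (Qx[i] + eps) for i in range(n)]
    return x


if __name__ == "__main__":
    # Small consistent system whose exact solution is x = (1, 2).
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    b = [1.0, 2.0, 3.0]
    print(nnls_multiplicative(A, b))
```

Because every coordinate is updated from the previous iterate alone, the per-coordinate work maps naturally onto independent workers, which is the property the abstract highlights for MapReduce or CUDA settings. The iterate also stays nonnegative without any explicit projection step.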

Citations: cited by 17 publications (6 citation statements)
References: 25 publications
“…The interface effectively performs parallel execution of data mining algorithms such as k-means clustering, principal component analysis and linear regression based on Map-Reduce programming model. The problem of parallelising data mining and machine learning algorithms for handling big data sources has been tackled by Luo et al (2012) since the task of parallelisation is non-trivial. They proposed a strategy to parallelise series of data mining algorithms such as support vector machine and linear regression models using Map-Reduce programming models.…”
Section: Research Attempts Based On Local Pattern Analytics Strategy
Citation type: mentioning
confidence: 99%
“…MapReduce is a parallelizable data process framework aiming to provide a generic method to processing data on cluster or a grid. It has been used in many different areas such as graph pattern analysis [19,20], itemset mining [21], support vector machine [22] and also sequential pattern mining [23]. To make full use of computing resources and storage resources in cluster, HDFS is always used to store data files as multiple copies, so that MapReduce can take advantage of locality of data, and decrease data transmission time.…”
Section: MapReduce
Citation type: mentioning
confidence: 99%
“…However, it is unable "live in harmony" with the power grid", causing serious abandonment in the wind and solar energy [2], reducing the utilization efficiency of new energy based on wind and solar energy, which is seriously inconsistent the China source strategy of green, environmental protection and sustainable development [3]. In twenty-first Century, with the concept of big data and cloud computing put forward, the "energy revolution" of mankind has been greatly impacted [4][5][6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%