2014
DOI: 10.1007/s11227-014-1151-8

Scalable CAIM discretization on multiple GPUs using concurrent kernels

Abstract: CAIM (Class-Attribute Interdependence Maximization) is one of the state-of-the-art algorithms for discretizing data for which classes are known. However, it may take a long time when run on high-dimensional, large-scale data with a large number of attributes and/or instances. This paper presents a solution to this problem by introducing a GPU-based implementation of the CAIM algorithm that significantly speeds up the discretization process on large, complex data sets. The GPU-based implementation is scalable to multiple GPUs.
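To make the abstract's approach concrete, here is a minimal sketch (not the paper's actual implementation) of the two ingredients involved: the per-attribute class-interval count table (the "quanta matrix") that the CAIM criterion scores, and one CUDA stream per attribute so that the histogram kernels for different attributes can execute as concurrent kernels. All identifiers (buildQuantaMatrix, caimValue, the fixed cut points) are illustrative assumptions; the criterion itself is the standard CAIM value, (1/n) Σ_r max_r² / M_r over the n intervals.

```cuda
// Minimal sketch, NOT the paper's code: per-attribute quanta-matrix histograms
// launched on separate CUDA streams so the kernels can execute concurrently.
// All identifiers here are illustrative assumptions.
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// One thread per instance: find this instance's interval for the given
// attribute and bump the (class, interval) cell of the quanta matrix.
__global__ void buildQuantaMatrix(const float *attr, const int *label,
                                  const float *bounds,   // K-1 inner cut points
                                  int K, int N, int *quanta /* C x K counts */)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;
    int r = 0;
    while (r < K - 1 && attr[i] > bounds[r]) ++r;        // locate interval
    atomicAdd(&quanta[label[i] * K + r], 1);
}

// CAIM criterion on the host: (1/K) * sum_r (max_r^2 / M_r), where max_r is
// the dominant class count in interval r and M_r is the interval's total.
double caimValue(const int *quanta, int C, int K)
{
    double sum = 0.0;
    for (int r = 0; r < K; ++r) {
        long maxr = 0, Mr = 0;
        for (int c = 0; c < C; ++c) {
            long q = quanta[c * K + r];
            Mr += q;
            if (q > maxr) maxr = q;
        }
        if (Mr > 0) sum += (double)(maxr * maxr) / Mr;
    }
    return sum / K;
}

int main()
{
    const int N = 1 << 20, A = 8, C = 3, K = 4;          // instances, attributes, classes, intervals
    std::vector<float> h_attr((size_t)N * A);
    std::vector<int> h_label(N);
    for (int i = 0; i < N; ++i) h_label[i] = rand() % C; // toy labels
    for (size_t i = 0; i < h_attr.size(); ++i)           // toy attribute values in [0,1)
        h_attr[i] = (float)rand() / RAND_MAX;
    const float h_bounds[K - 1] = {0.25f, 0.5f, 0.75f};  // one fixed candidate scheme

    float *d_attr, *d_bounds; int *d_label, *d_quanta;
    cudaMalloc(&d_attr, h_attr.size() * sizeof(float));
    cudaMalloc(&d_label, N * sizeof(int));
    cudaMalloc(&d_bounds, (K - 1) * sizeof(float));
    cudaMalloc(&d_quanta, A * C * K * sizeof(int));
    cudaMemcpy(d_attr, h_attr.data(), h_attr.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_label, h_label.data(), N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_bounds, h_bounds, (K - 1) * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_quanta, 0, A * C * K * sizeof(int));

    std::vector<cudaStream_t> streams(A);
    for (int a = 0; a < A; ++a) cudaStreamCreate(&streams[a]);

    dim3 block(256), grid((N + 255) / 256);
    for (int a = 0; a < A; ++a)   // one stream per attribute -> concurrent kernels
        buildQuantaMatrix<<<grid, block, 0, streams[a]>>>(
            d_attr + (size_t)a * N, d_label, d_bounds, K, N, d_quanta + a * C * K);
    cudaDeviceSynchronize();

    std::vector<int> h_quanta(A * C * K);
    cudaMemcpy(h_quanta.data(), d_quanta, h_quanta.size() * sizeof(int), cudaMemcpyDeviceToHost);
    for (int a = 0; a < A; ++a)
        printf("attribute %d: CAIM = %.4f\n", a, caimValue(&h_quanta[(size_t)a * C * K], C, K));

    for (int a = 0; a < A; ++a) cudaStreamDestroy(streams[a]);
    cudaFree(d_attr); cudaFree(d_label); cudaFree(d_bounds); cudaFree(d_quanta);
    return 0;
}
```

In the full algorithm the cut-point set grows greedily: each iteration scores every remaining candidate boundary with the CAIM criterion, keeps the best one, and stops once CAIM no longer improves (typically after the interval count reaches the number of classes).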

Cited by 7 publications (3 citation statements) · References 39 publications

Citation statements (ordered by relevance):
“…Some works have tried to deal with large‐scale discretization. For example, in Ref , the authors proposed a scalable implementation of Class‐Attribute Interdependence Maximization algorithm by using GPU technology. In Ref , a discretizer based on windowing and hierarchical clustering is proposed to improve the performance of classical tree‐based classifiers.…”
Section: Taxonomy (mentioning)
confidence: 99%
“…They offer higher scalability to big data problems for a fraction of the cost of a traditional mainframe solution. GPUs are particularly efficient for streaming environments and provide a very fast decision with minimum label latency [22][23][24][25][26][27]. However, they are often associated with a more difficult code implementation and limited memory, which makes it difficult to scale to true big data problems.…”
Section: Data Stream Mining for Online Learning (mentioning)
confidence: 99%
“…Therefore, it can easily scale to large problems. Moreover, discretization of multiple attributes can be parallelized using CPU threads or GPUs [7]. Table 13 shows the discretization time for the datasets.…”
Section: Space and Time Complexity (mentioning)
confidence: 99%
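The last statement notes that attributes can be discretized independently, and therefore in parallel on CPU threads as well as on GPUs. A small host-side sketch of that idea follows (one CPU thread per attribute); the equal-width binning is only a hypothetical stand-in for a real per-attribute discretizer such as CAIM's greedy cut-point search, and all names are invented for the example.

```cpp
// Hedged sketch: per-attribute discretization on CPU threads. Attributes are
// independent, so each one gets its own worker thread.
#include <algorithm>
#include <thread>
#include <vector>

// Placeholder discretizer: equal-width cut points for one attribute. A real
// implementation would run CAIM's greedy boundary search here instead.
std::vector<float> discretizeAttribute(const std::vector<float> &values, int numIntervals)
{
    auto [lo, hi] = std::minmax_element(values.begin(), values.end());
    float width = (*hi - *lo) / numIntervals;
    std::vector<float> bounds;
    for (int r = 1; r < numIntervals; ++r) bounds.push_back(*lo + r * width);
    return bounds;
}

int main()
{
    const int numAttributes = 8, numIntervals = 4;
    std::vector<std::vector<float>> data(numAttributes, std::vector<float>(1000));
    for (auto &col : data)                      // toy data, one column per attribute
        for (size_t i = 0; i < col.size(); ++i) col[i] = (float)i / col.size();

    std::vector<std::vector<float>> bounds(numAttributes);
    std::vector<std::thread> workers;
    for (int a = 0; a < numAttributes; ++a)     // one thread per attribute
        workers.emplace_back([&, a] {
            bounds[a] = discretizeAttribute(data[a], numIntervals);
        });
    for (auto &t : workers) t.join();           // all attributes discretized in parallel
    return 0;
}
```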