2002
DOI: 10.1007/3-540-45706-2_48

Parallel Fuzzy c-Means Clustering for Large Data Sets

Abstract: The parallel fuzzy c-means (PFCM) algorithm for clustering large data sets is proposed in this paper. The proposed algorithm is designed to run on parallel computers of the Single Program Multiple Data (SPMD) model type with the Message Passing Interface (MPI). A comparison is made between PFCM and an existing parallel k-means (PKM) algorithm in terms of their parallelisation capability and scalability. In an implementation of PFCM to cluster a large data set from an insurance company, the proposed a…

Citations: cited by 94 publications (43 citation statements)
References: 10 publications (19 reference statements)
“…FCM finds hyper-spherical clusters and partitions lying in Another approach involves collaborative clustering, where the algorithm runs in every data site and iteratively shares information across sites trying to find a global structure [43,47]. Despite the existence of many algorithms for parallel and distributed data, only a few of them have been developed for fuzzy clustering as a generalization of the original (centralized) ones, e.g., see [48][49][50]. In other words, just a few fuzzy clustering algorithms have been generalized to deal with parallel and distributed data in an exact way.…”
Section: Subsets (Clusters)
Citation type: mentioning
confidence: 99%
“…The algorithm called here Distributed Fuzzy c-Means (DFCM) is based on the ideas on the parallelization of computations in the FCM algorithm [48][49][50] and is a formal generalization of FCM to handle distributed data. Note that Algorithm 1 (FCM in Sect.…”
Section: DFCM: Distributed Fuzzy C-Means
Citation type: mentioning
confidence: 99%
“…The abstraction tree bears some resemblance to the major familiar quad tree data structure [17] used in the several image processing and image analysis algorithms. Clustering is the process of grouping a data set in a way that the similarity between data within a cluster is maximized while the similarity between data of different clusters is minimized [18] and is used for pattern recognition in image processing. To recognize a given pattern in an image various techniques have been utilized, but in general two broad categories of classifications have been made: unsupervised techniques and supervised techniques.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…More specifically, some were developed as generalizations of centralized versions of a specific algorithm (Olson, 1995; Dhillon & Modha, 2000; Forman & Zhang, 2000; Garg et al, 2006), and are able to produce the same final results that the respective original algorithms would obtain if they could be applied to the data in a centralized way. Although many algorithms can handle parallel and distributed data, few have been developed for fuzzy data clustering as generalizations of centralized versions of a given algorithm (Kwok et al, 2002; Rahimi et al, 2004; Modenesi et al, 2007). In other words, few fuzzy data clustering algorithms have been generalized to work with parallel and distributed data so as to produce the same final results that the centralized version of the algorithm would obtain with centralized data.…”
Section: Generalization of the Algorithms and Indices Studied
Citation type: unclassified
“…The algorithm called DFCM (Distributed Fuzzy c-Means) was originally proposed in the parallel context (Kwok et al, 2002; Rahimi et al, 2004; Modenesi et al, 2007). This algorithm is a generalization of the FCM algorithm (Section 2.2.1) to handle parallel or distributed data.…”
Section: DFCM: Distributed Fuzzy C-Means
Citation type: unclassified
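
Several of the citation statements above describe parallel or distributed fuzzy c-means as an exact generalization of the centralized algorithm. One way to see why this can hold is that the standard FCM centre update is a ratio of sums over data points, so each partition can contribute partial sums that are then added together. The following is a minimal sketch of that idea, not the implementation from the paper: the function names (`memberships`, `partial_sums`, `update_centres`), the one-dimensional data, and the fuzzifier `m = 2.0` are all assumptions made for illustration, and the partitions are plain Python lists standing in for data held by separate processes (no MPI library is used).

```python
# Illustrative sketch only: each inner list plays the role of the data owned by
# one process. In an MPI program the final summation would be a reduction.

def memberships(x, centres, m=2.0):
    """Standard FCM membership of scalar point x in each cluster."""
    dists = [abs(x - c) for c in centres]
    # A point lying exactly on a centre belongs fully to that centre.
    if 0.0 in dists:
        hit = dists.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(centres))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def partial_sums(points, centres, m=2.0):
    """Numerators and denominators of the centre update for one partition."""
    num = [0.0] * len(centres)
    den = [0.0] * len(centres)
    for x in points:
        for i, u in enumerate(memberships(x, centres, m)):
            w = u ** m
            num[i] += w * x
            den[i] += w
    return num, den


def update_centres(partitions, centres, m=2.0):
    """Add up the partial sums of all partitions, then divide once."""
    total_num = [0.0] * len(centres)
    total_den = [0.0] * len(centres)
    for part in partitions:
        num, den = partial_sums(part, centres, m)
        total_num = [a + b for a, b in zip(total_num, num)]
        total_den = [a + b for a, b in zip(total_den, den)]
    return [n / d for n, d in zip(total_num, total_den)]


data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centres = [0.0, 6.0]

# One partition (centralized) versus three partitions (simulated distribution).
central = update_centres([data], centres)
split = update_centres([data[:2], data[2:5], data[5:]], centres)
```

Under these assumptions, `central` and `split` agree up to floating-point rounding, because only the order of summation differs between the two calls.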