DBDC: Density Based Distributed Clustering

Eshref Januzaj, Hans-Peter Kriegel, and Martin Pfeifle

University of Munich, Institute for Computer Science
http://www.dbs.informatik.uni-muenchen.de
{januzaj, kriegel, pfeifle}@informatik.uni-muenchen.de

Abstract. Clustering has become an increasingly important task in modern application domains such as marketing and purchasing assistance, multimedia, molecular biology as well as many others. In most of these areas, the data are originally collected at different sites. In order to extract information from these data, they are merged at a central site and then clustered. In this paper, we propose a different approach. We cluster the data locally and extract suitable representatives from these clusters. These representatives are sent to a global server site where we restore the complete clustering based on the local representatives. This approach is very efficient, because the local clusterings can be carried out quickly and independently of each other. Furthermore, we have low transmission cost, as the number of transmitted representatives is much smaller than the cardinality of the complete data set. Based on this small number of representatives, the global clustering can be done very efficiently. For both the local and the global clustering, we use a density based clustering algorithm. The combination of both the local and the global clustering forms our new DBDC (Density Based Distributed Clustering) algorithm.
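The local-then-global scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how representatives are chosen, so this sketch assumes one centroid per local cluster, and the names `dbscan`, `centroid`, `dbdc_sketch`, `eps_local`, `min_pts`, and `eps_global` are hypothetical. A plain textbook DBSCAN stands in for the unspecified density based clustering algorithm.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def centroid(pts):
    """Assumed representative: the mean of a local cluster's points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def dbdc_sketch(sites, eps_local, min_pts, eps_global):
    """Cluster each site locally, cluster the representatives globally,
    then relabel every local point with its global cluster id."""
    reps, owners, local_labels = [], [], []
    for s, pts in enumerate(sites):
        labels = dbscan(pts, eps_local, min_pts)      # step 1: local clustering
        local_labels.append(labels)
        for c in sorted(set(labels) - {-1}):
            members = [p for p, l in zip(pts, labels) if l == c]
            reps.append(centroid(members))            # step 2: only this is transmitted
            owners.append((s, c))
    # Step 3: global clustering on the representatives alone. min_pts=1 is an
    # assumption so that every representative lands in some global cluster.
    global_of = dict(zip(owners, dbscan(reps, eps_global, 1)))
    # Step 4: each site maps its local cluster ids to global ones; noise stays -1.
    return [[global_of.get((s, l), -1) for l in labels]
            for s, labels in enumerate(local_labels)]


site_a = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
site_b = [(2, 0), (2, 1), (3, 0), (3, 1), (50, 50)]
result = dbdc_sketch([site_a, site_b], eps_local=1.5, min_pts=3, eps_global=3.0)
```

In this toy run only three representatives cross the network instead of thirteen points, which is the transmission saving the abstract refers to. The two nearby local clusters from different sites end up merged into one global cluster.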
Furthermore, we discuss the complex problem of finding a suitable quality measure for evaluating distributed clusterings. We introduce two quality criteria which are compared to each other and which allow us to evaluate the quality of our DBDC algorithm. In our experimental evaluation, we will show that we do not have to sacrifice clustering quality in order to gain an efficiency advantage when using our distributed clustering approach.

Introduction

Knowledge Discovery in Databases (KDD) tries to identify valid, novel, potentially useful, and ultimately understandable patterns in data. Traditional KDD applications require full access to the data which is going to be analyzed. All data has to be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex data reside on different, independently working computers which are connected to each other via local or w ...