2016
DOI: 10.1007/s10115-016-0958-4

DASC: data aware algorithm for scalable clustering

Abstract: The emergence of the MapReduce (MR) framework for scaling data mining and machine learning algorithms addresses Volume, while handling Variety and Velocity must be skilfully crafted into the algorithms themselves. So far, scalable clustering algorithms have focused solely on Volume, taking advantage of the MR framework. In this paper we present a MapReduce algorithm, data aware scalable clustering (DASC), which is capable of handling the 3 Vs of big data by virtue of being (i) single scan and distributed to handle Volume, (ii…

Cited by 3 publications (1 citation statement)
References 21 publications
“…A semi-supervised clustering method is considered in cases where the non-dominant attributes are more important for the clustering results than the dominant ones. Bhatnagar et al. (2017) proposed the data-aware scalable clustering (DASC) algorithm, an incremental algorithm inspired by a grid-based stream clustering algorithm called ExCC, proposed by Bhatnagar, Kaur and Chakravarthy (2013). This method is able to handle the three V's of big data, working with: (i) distributed and large files (Volume); (ii) data streams (Velocity); and (iii) mixed-type attributes (Variety).…”
Section: Discussion
confidence: 99%
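The grid-based, single-scan idea the statement attributes to DASC/ExCC can be illustrated with a minimal sketch: each point updates only the count of its grid cell in one pass, and clusters are then read off as connected groups of dense cells. This is an assumption-laden toy (fixed cell width, simple density threshold, in-memory counts), not the published DASC or ExCC implementation.

```python
from collections import defaultdict

def cell_of(point, width):
    """Map a numeric point to its grid cell (a tuple of cell indices)."""
    return tuple(int(x // width) for x in point)

def incremental_grid_clusters(stream, width=1.0, density=2):
    """Single scan: count points per cell, then flood-fill adjacent
    dense cells into clusters. Hypothetical sketch, not DASC itself."""
    counts = defaultdict(int)
    for p in stream:                      # one pass over the data
        counts[cell_of(p, width)] += 1
    dense = {c for c, n in counts.items() if n >= density}
    clusters, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            c = stack.pop()
            if c in seen:
                continue
            seen.add(c)
            comp.add(c)
            # neighbouring cells differ by at most 1 in every dimension
            for nb in dense:
                if nb not in seen and all(abs(a - b) <= 1 for a, b in zip(c, nb)):
                    stack.append(nb)
        clusters.append(comp)
    return clusters
```

Because only per-cell counters are updated, new points can arrive at any time (Velocity) and the counting pass parallelises naturally across data partitions (Volume), which is the intuition behind making such an algorithm MapReduce-friendly.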