Clustering Mixed Data Based on Evidence Accumulation

Luo, Huilan; Kong, Fanrong; Li, Yixiao

doi:10.1007/11811305_38

Cited by 26 publications

(11 citation statements)

References 9 publications

“…For scalable clustering of mixed data, orthogonal partitioning clustering algorithm [13] was introduced which was later extended by the authors in [14] for the purpose of clustering large databases with numeric and nominal values using orthogonal projections. To achieve a similar objective, a fuzzy clustering algorithm [15] based on probabilistic distance feature, an agglomerative algorithm based on distinctness heuristics as well as the Evidence Based Spectral Clustering (EBSC) algorithm [16] based on evidence accumulation were introduced in the recent past. On the other hand, authors in [17] introduced three different distance measure functions based on Mahalanobis-type distance measure for the efficient analysis of mixed data.…”

Section: Related Work (mentioning)

confidence: 99%

Multi Level Mining of Warehouse Schema

Usman

Pears

2011

Networked Digital Technologies

Abstract. The two mature disciplines, namely Data Mining and Data Warehousing have broadly the same set of objectives. Yet, they have developed largely separate from each other resulting in different techniques being used in each discipline. It has been recognized that mining techniques developed for pattern recognition such as Clustering and Visualization can assist in designing data warehouse schema. However, a suitable methodology is required for the seamless integration of mining methods in the design of warehouse schema. In previous work, we presented a methodology that employs hierarchical clustering to derive a tree structure that can be used by a data warehouse designer to build a schema. We believe that, in order to strengthen the decision making process, there is a strong need for a method that automatically extracts knowledge present at different levels of abstraction from a warehouse. We demonstrate with examples how mining at different levels of a hierarchical warehouse schema can give new insights about the underlying data cluster which not only helps in building more meaningful dimensions and facts for data warehouse design but can also improve the decision making process.

“…However, doing this does not reveal the original similarity structure of the data sets. In [27], an Evidence-Based Spectral Clustering algorithm is proposed for mixed data by integrating the evidence-based similarity measure into spectral clustering structure. The algorithm [28] assumes a classical finite mixture distribution model on mixed data and utilizes a Bayesian model to derive the most probable class distribution for the data given with prior information.…”

Section: Related Work (mentioning)

confidence: 99%

An equi-biased k-prototypes algorithm for clustering mixed-type data

Sangam

2018

Sādhanā

Clustering has been recognized as a very important approach for data analysis that partitions the data according to some (dis)similarity criterion. In recent years, the problem of clustering mixed-type data has attracted many researchers. The k-prototypes algorithm is well known for its scalability in this respect. In this paper, the limitations of dissimilarity coefficient used in the k-prototypes algorithm are discussed with some illustrative examples. We propose a new hybrid dissimilarity coefficient for k-prototypes algorithm, which can be applied to the data with numerical, categorical and mixed attributes. Besides retaining the scalability of the k-prototypes algorithm in our method, the dissimilarity functions for either-type attributes are defined on the same scale with respect to their dimensionality, which is very beneficial to improve the efficiency of clustering result. The efficacy of our method is shown by experiments on real and synthetic data sets.

“…Improved k-prototypes [16] can cluster incomplete mixed-type data directly and eliminate the sensitivity of initial prototypes. Evidence-Based Spectral Clustering algorithm [17] integrates the spectral clustering frame and evidence-based similarity computation method to cluster mixed-type data. Moreover, some similarity or dissimilarity of mixed-type data was proposed.…”

Section: Mixed-type Data Clustering Algorithms (mentioning)

confidence: 99%

Auto Insurance Business Analytics Approach for Customer Segmentation Using Multiple Mixed-Type Data Clustering Algorithms

Zhuang

Gao

2018

Teh. vjesn.

Customer segmentation is critical for auto insurance companies to gain competitive advantage by mining useful customer related information. While some efforts have been made for customer segmentation to support auto insurance decision making, their customer segmentation results tend to be affected by the characteristics of the algorithm used and lack multiple validation from multiple algorithms. To this end, we propose an auto insurance business analytics approach that segments customers by using three mixed-type data clustering algorithms including k-prototypes, improved k-prototypes and similarity-based agglomerative clustering. The customer segmentation results of these algorithms can complement and reinforce each other and demonstrate as much information as possible to support decision-making. To confirm its practical value, the proposed approach extracts seven rules for an auto insurance company that may support the company to make customer related decisions and develop insurance products.
