Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/302
Affinity Learning for Mixed Data Clustering

Abstract: In this paper, we propose a novel affinity learning based framework for mixed data clustering, which includes: how to process data with mixed-type attributes, how to learn affinities between data points, and how to exploit the learned affinities for clustering. In the proposed framework, each original data attribute is represented with several abstract objects defined according to the specific data type and values. Each attribute value is transformed into the initial affinities between the data point and the a…

Cited by 8 publications (3 citation statements)
References 4 publications
“…The two latter packages provide both a large number of clustering and cluster stability assessment methods and functions to compute dissimilarity matrices and describe the results). Another known alternative consists of one-hot-encoding categorical data into binary variables and treating the latter as continuous (e.g., Li and Latecki, 2017 ). It is, however, necessary to down-weight the variables obtained, so that no more weight is given to the original variables with more modalities.…”
Section: Statistical Rationale and Literature Review on Data Clustering
confidence: 99%
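The one-hot-encoding-with-down-weighting alternative described in the statement above can be sketched as follows. This is a minimal illustration, not code from the cited paper; the 1/sqrt(k) scaling is one common choice for keeping a k-modality categorical variable from outweighing a single continuous variable in Euclidean distances:

```python
import numpy as np

def one_hot_downweighted(values):
    """One-hot encode a 1-D categorical sequence, scaling each binary
    column by 1/sqrt(k), where k is the number of modalities, so the
    variable's total contribution to squared distances stays bounded
    regardless of how many modalities it has."""
    categories = sorted(set(values))
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    onehot = np.zeros((len(values), k))
    for row, v in enumerate(values):
        onehot[row, index[v]] = 1.0
    return onehot / np.sqrt(k)

X = one_hot_downweighted(["red", "blue", "red", "green"])
# Two points with different categories now contribute a squared
# Euclidean distance of 2/k instead of 2 on this variable.
```

Without the scaling, a variable with many modalities contributes a constant squared distance of 2 whenever two points differ, which systematically over-weights it relative to standardized continuous variables.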
“…Another known alternative consists of one-hot-encoding categorical data into binary variables and treating the latter as continuous (e.g. in Li et al 2017). It is however necessary to downweigh the obtained variables so that no more weight is given to the original variables with more modalities.…”
Section: Choosing an Appropriate Clustering Approach
confidence: 99%
“…Positive unlabelled (PU) inference is based on data sets containing labelled observations (S = 1) which are all positive (Y = 1), and unlabelled ones (S = 0) which may either belong to a positive or a negative class (Y is either 1 or 0). Examples of such experimental setup abound in medicine [36,22,6,38], text and image analysis [9,27,26,15], ecology [37,29] and survey data [33]. For example, medical databases may contain only information about diagnosed patients who have a certain disease (S = 1) whereas un-diagnosed patients (S = 0) may have it or not.…”
Section: Introduction
confidence: 99%
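The positive-unlabelled (PU) labelling scheme described in the statement above can be illustrated with a small synthetic sketch. The data and the 30% labelling rate are hypothetical, chosen only to show the structural constraint that labelled examples (S = 1) are always positive (Y = 1), while unlabelled examples (S = 0) mix both classes:

```python
import numpy as np

rng = np.random.default_rng(0)

# True (hidden) class labels: Y = 1 is positive, Y = 0 is negative.
Y = rng.integers(0, 2, size=1000)

# Labelling mechanism: only positives can be labelled, and only a
# fraction (here 30%, an assumed rate) of them actually are.
S = np.where((Y == 1) & (rng.random(1000) < 0.3), 1, 0)

# Structural properties of the PU setting:
#   - every labelled point is positive
#   - the unlabelled pool contains both positives and negatives
labelled_all_positive = bool(np.all(Y[S == 1] == 1))
unlabelled_classes = set(Y[S == 0].tolist())
```

This mirrors the medical example in the quote: diagnosed patients correspond to S = 1 (and necessarily Y = 1), while undiagnosed patients (S = 0) may or may not have the disease.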