Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2014
DOI: 10.1145/2623330.2623725

Representative clustering of uncertain data

Abstract: This paper targets the problem of computing meaningful clusterings from uncertain data sets. Existing methods for clustering uncertain data compute a single clustering without any indication of its quality and reliability; thus, decisions based on their results are questionable. In this paper, we describe a framework based on possible-worlds semantics; when applied to an uncertain dataset, it computes a set of representative clusterings, each of which has a probabilistic guarantee not to exceed some maximum d…

Cited by 27 publications (9 citation statements)
References 51 publications
“…As experimental results verify in Figure 6a,b, the above properties have a positive impact on vulnerability which is pair-wise preserved, showing that the clusters occur within cluster space-time similarity. The authors in [22] provide an efficient scheme for representative clustering on uncertain data. Finally, assuming feature suppression, the method with clustering demonstrates higher robustness or lower vulnerability, which is the main issue in k-anonymity, and thus in privacy preservation.…”
Section: Discussion
confidence: 99%
“…There is no uncertainties in the original UCR datasets, we need to construct uncertainties based on the method mentioned in [14] first. The uncertainties can be described by the samples representing the possible values, so we choose Gaussian distribution to generate samples for each object in dataset .…”
Section: Methods
confidence: 99%
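
The statement above describes representing each uncertain object by samples drawn from a Gaussian distribution. A minimal sketch of that idea follows; it is not the procedure from [14], which is not reproduced on this page. The function name `make_uncertain`, the noise level `sigma`, the sample count `n_samples`, and the toy series are all illustrative assumptions.

```python
import random


def make_uncertain(objects, sigma=0.1, n_samples=5, seed=0):
    """Attach sample-based uncertainty to certain objects.

    Each object (a list of floats) is replaced by n_samples noisy copies,
    where every value is drawn from a Gaussian centred on the original
    value with standard deviation sigma. All parameters are hypothetical
    choices for illustration, not values taken from the cited work.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same samples
    return [
        [[rng.gauss(value, sigma) for value in obj] for _ in range(n_samples)]
        for obj in objects
    ]


# Two toy series standing in for objects of a certain (noise-free) dataset.
series = [[0.0, 1.0, 2.0], [5.0, 4.0, 3.0]]
uncertain = make_uncertain(series, sigma=0.1, n_samples=5)
```

Each entry of `uncertain` is the list of possible values for one object, so picking one sample per object yields one possible world of the dataset.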
“…For many application domains, the ability to unearth valuable knowledge from a dataset is impaired by unreliable, erroneous, obsolete, imprecise, and noisy data (Schubert et al 2015; Züfle et al 2014), or, in other words, uncertain data that is commonly described by a probability distribution (Jiang et al 2013; Pei et al 2007). Uncertain data are found in modeling situations where a mathematical model only approximates the actual nonconforming quality control process.…”
Section: Uncertain Data Clustering
confidence: 99%