2010 Fifth International Conference on Digital Information Management (ICDIM) 2010
DOI: 10.1109/icdim.2010.5664691

Clustering approaches for data with missing values: Comparison and evaluation

Cited by 23 publications (20 citation statements)
References 8 publications
“…PDS The Partial Distance Strategy [10,15]. Calculate the sum of squared differences of the mutually known components and scale to the missing components:…”
Section: Methods (mentioning)
confidence: 99%
“…A simple and widely used method for estimating distance with missing values is the Partial Distance Strategy (PDS) [10,15]. In the PDS, an estimate for the squared distance is found by calculating the sum of squared differences of the mutually known components, and scaling the value proportionally to account for the missing values.…”
Section: Related Work (mentioning)
confidence: 99%
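The Partial Distance Strategy described in the statement above can be illustrated with a short sketch. This is not code from the cited paper or from the citing works. It assumes missing components are encoded as NaN and that "scaling proportionally" means multiplying the partial sum by the ratio of the total number of components to the number of mutually known ones. The function name `partial_distance_sq` is hypothetical.

```python
import math


def partial_distance_sq(x, y):
    """Estimate the squared Euclidean distance between x and y.

    Missing components are assumed to be encoded as NaN (an assumption
    of this sketch, not something the source specifies).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    # Keep only the components that are known in both vectors.
    known = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    if not known:
        # No mutually known components: the estimate is undefined.
        return math.nan

    # Sum of squared differences over the mutually known components.
    partial = sum((a - b) ** 2 for a, b in known)

    # Scale up in proportion to the share of components that were usable.
    return partial * len(x) / len(known)


x = [1.0, math.nan, 3.0, 4.0]
y = [2.0, 5.0, math.nan, 6.0]
d2 = partial_distance_sq(x, y)
```

Here only the first and fourth components are known in both vectors, so the partial sum is taken over two of the four components and then scaled by 4/2.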
“…Conrad and Himmelspach [1] concluded that the missing values are considered MCAR, if the absence of data does not rely on data values in the data matrix that are experimented. It is believed that missing data cause biased result when performing data mining tasks because missing data are still considered as valuable representative attributes with respect to the hidden information in data sets.…”
Section: Introduction (mentioning)
confidence: 99%
“…Grouping data were available for 26 interviews. Usually free pile-sorting data are analysed using cluster analysis [Bernard, 2006]; however, these analyses are sensitive to missing values [Himmelspach and Conrad, 2010], which in our case were generated by the exclusion of non-recognized animals from each free pile-sorting task. A large number of clustering algorithms that can deal with missing values has been proposed [Kaufman and Rousseeuw, 2005], and research in this field remains active.…”
Section: Identification Test and Species Groupings (mentioning)
confidence: 99%
“…Folk taxonomies are usually derived using pile sort data [López et al, 1997;Koster et al, 2010;Papworth et al, 2013], whereby participants are asked to place a set number of taxa into groups which they perceive as containing animals that are similar. One of the caveats of the most commonly used pile sort analyses is that they are sensitive to missing data [Himmelspach and Conrad, 2010], and therefore require the researcher to use only species that will be recognized by the majority of participants, or will ask participants to sort animals which they do not recognize. The latter may be particularly problematic if the folk taxonomy of the community is based upon non-morphological characters, such as diet or the time of day during which a particular animal is active.…”
Section: Introduction (mentioning)
confidence: 99%