2007
DOI: 10.1002/sim.3143

INCA: New statistic for estimating the number of clusters and identifying atypical units

Abstract: This paper presents a solution to two problems that arise in the classification of data such as tumor types, samples of gene expression profiles, or general biomedical data: first, to estimate the real number of clusters in a data set, and second, to decide whether a new unit belongs to one of these previously identified clusters or is an outlier or atypical unit. We propose a new statistic that allows us to solve these problems. As our approach is based on a measure of distance or dissimilarity between an…

citations
Cited by 20 publications
(19 citation statements)
references
References 22 publications
“…That's why a second algorithm, HIPAM IMO, is proposed, where the differences regarding the original HIPAM are even deeper. It incorporates a different criterion: the INCA statistic criterion (Irigoien and Arenas 2008; Arenas and Cuadras 2002; Irigoien, Sierra, and Arenas 2012) to decide the number of child clusters and as a stopping rule. In short, INCA is defined as the probability of properly classified individuals and it is estimated with the following expression:…”
Section: The Hipamanthropom Methodology (mentioning)
confidence: 99%
“…Such an approach was introduced for the first time in [32] in a regression analysis with mixed data. It has been used by other authors as in [33], [34], [35], [36], [37], [38], [39]. This approach requires Euclidean distance measures.…”
Section: Fundamentals (mentioning)
confidence: 98%
“…However, direct uses of this approach, with this particular name, have not been found in the robotics literature. There are different approaches found in the literature to deal with the typicality problem (Bar-Hen, 2001; Cuadras & Fortiana, 2000; Irigoien & Arenas, 2008; McDonald et al., 1976; Rao, 1962). Some of them are only suitable for normal multivariate data, others are appropriate for any kind of data but are limited to k = 2, k being the number of classes.…”
Section: Literature Review (mentioning)
confidence: 99%
“…However, and in spite of the high diversity of the used methods, to the best of the author's knowledge, neither typicality nor one-class approaches appear in the mapping literature. The approach proposed in this chapter combines the INCA statistic (Irigoien & Arenas, 2008) with the topological properties of the environmental locations considered and thus represents a new approach to tackling the robot mapping problem as a typicality case.…”
Section: Literature Review (mentioning)
confidence: 99%