2019
DOI: 10.1145/3347711

How Complex Is Your Classification Problem?

Abstract: Characteristics extracted from the training datasets of classification problems have proven to be effective predictors in a number of meta-analyses. Among them, measures of classification complexity can be used to estimate the difficulty in separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary are among the known measures for this characterization. This information can support the formulation of n…
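One of the classical complexity measures covered by the surveyed literature is the maximum Fisher's discriminant ratio (F1), which checks whether any single feature separates the two classes well. The sketch below is a minimal illustration of the classical Ho-and-Basu form for a two-class numeric dataset; the function name fisher_ratio_f1 and the toy data are ours, and the paper itself works with a rescaled variant, so this is an assumption-laden sketch rather than the paper's exact definition.

import numpy as np

def fisher_ratio_f1(X, y):
    """Maximum Fisher's discriminant ratio (F1) for a two-class dataset.

    For each feature f, computes r_f = (mu1 - mu2)^2 / (s1^2 + s2^2),
    then returns max_f r_f. Higher values mean at least one feature
    separates the classes well (i.e., a simpler problem).
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    c1, c2 = np.unique(y)              # assumes exactly two classes
    X1, X2 = X[y == c1], X[y == c2]
    num = (X1.mean(axis=0) - X2.mean(axis=0)) ** 2
    den = X1.var(axis=0) + X2.var(axis=0)
    r = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return r.max()

# Toy usage: two well-separated Gaussian blobs give a large F1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(fisher_ratio_f1(X, y))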

Cited by 161 publications (68 citation statements)
References 81 publications
Citation types: 0 supporting, 49 mentioning, 0 contrasting
Years published: 2019–2024

Citation statements (ordered by relevance)
“…Alfonso-Reese et al. (2002) focused on three measures (class separation, covariance complexity, and error rate of the ideal observer). Rosedahl and Ashby (2019) examined these same three measures, plus nine other measures that are popular in the machine-learning literature (described by Lorena, Garcia, Lehmann, Souto, & Ho, 2018), and in addition, they proposed a new measure (i.e., the striatal difficulty measure). Here we tested the predictions of all these measures on RB and II category structures.…”
Section: Difficulty Measures (mentioning)
confidence: 99%
“…These 13 alternative measures and AlexNet are described briefly in this section. For more details, see Rosedahl and Ashby (2019), Lorena et al. (2018), and Krizhevsky et al. (2012). SDM is defined as between-category similarity divided by within-category similarity, where between-category similarity is the summed similarity of all exemplars to all other exemplars belonging to contrasting categories and within-category similarity is the summed similarity of all exemplars to all other exemplars belonging to the same category.…”
Section: Difficulty Measures (mentioning)
confidence: 99%
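To make the quoted definition concrete, here is a minimal sketch of the striatal difficulty measure (SDM) as described above. The Gaussian-like similarity kernel exp(-c·d²) and its parameter c are assumptions on our part; Rosedahl and Ashby (2019) may use a different similarity function.

import numpy as np
from scipy.spatial.distance import cdist

def striatal_difficulty_measure(X, y, c=1.0):
    """Sketch of SDM: between-category similarity divided by
    within-category similarity, summed over all exemplar pairs.

    Similarity here is a Gaussian-like kernel exp(-c * d^2); the
    kernel and its parameter c are assumptions, not necessarily the
    exact form used by Rosedahl and Ashby (2019).
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    S = np.exp(-c * cdist(X, X) ** 2)   # pairwise similarities
    np.fill_diagonal(S, 0.0)            # exclude self-similarity
    same = (y[:, None] == y[None, :])   # mask of same-category pairs
    between = S[~same].sum()
    within = S[same].sum()
    return between / within             # higher => harder to separate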
“…The graphs were built from the adjacency matrix, based on an approximation method that uses an ε-NN (nearest neighbor) function. As in related work [46], ε is defined as 15% of the number of examples the dataset has. Finally, in our implementation, the percentage of points of high distance corresponds to the proportion of points with a distance greater than the mean distance plus its standard deviation.…”
Section: Meta-data (mentioning)
confidence: 99%
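Here is a minimal sketch of the graph construction described in the quoted passage, reading the nearest-neighbor parameter as a neighbor count equal to 15% of the number of examples; the cited work may instead threshold on a distance value, and the helper names are ours.

import numpy as np
from scipy.spatial.distance import cdist

def enn_adjacency(X, frac=0.15):
    """Adjacency matrix via a nearest-neighbor rule: each point is
    linked to its k = ceil(frac * n) nearest neighbors (reading the
    quoted epsilon as a neighbor count is an assumption).
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    k = max(1, int(np.ceil(frac * n)))
    D = cdist(X, X)
    np.fill_diagonal(D, np.inf)              # no self-loops
    A = np.zeros((n, n), dtype=bool)
    nearest = np.argsort(D, axis=1)[:, :k]   # k nearest per row
    rows = np.repeat(np.arange(n), k)
    A[rows, nearest.ravel()] = True
    A |= A.T                                 # symmetrize the graph
    return A

def high_distance_fraction(X):
    """Proportion of pairwise distances above mean + one standard
    deviation, one reading of the rule quoted above."""
    D = cdist(X, X)
    d = D[np.triu_indices(len(X), k=1)]      # unique pairs only
    return np.mean(d > d.mean() + d.std())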
“…Some meta-features listed in Table 1 have already been used in previous work to characterize test instances in clustering problems [10,17,18,20,44,45,47]. We have also included some meta-features that have been used in classification problems that can be directly or easily adapted to characterize clustering problems [46]. As a contribution of this work, we have proposed using multivariate normality skewness and kurtosis measures [43] and a meta-feature group based on well-known network centrality measures [48,49].…”
Section: Meta-data (mentioning)
confidence: 99%
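The "multivariate normality skewness and kurtosis measures" cited as [43] are plausibly Mardia's statistics; the sketch below assumes that reading and is not a reproduction of the cited work's exact definitions.

import numpy as np

def mardia_measures(X):
    """Mardia's multivariate skewness and kurtosis (an assumed
    reading of reference [43] in the quoted passage).

    skewness: b1 = mean over all pairs (i, j) of g_ij^3
    kurtosis: b2 = mean over i of g_ii^2
    where g_ij = (x_i - mean)^T S^{-1} (x_j - mean).
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)   # biased (1/n) covariance
    G = Xc @ np.linalg.pinv(S) @ Xc.T        # matrix of g_ij values
    b1 = (G ** 3).sum() / n ** 2
    b2 = (np.diag(G) ** 2).sum() / n
    return b1, b2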
“…Complexity measures: Direct measures of the complexity of a problem such as the Kolmogorov measure are infeasible to compute in most cases, and though approximate metrics have been suggested, they are not useful for deep networks [6]. Various theoretical studies prove the bounds on the capacity of neural networks ranging from VC complexity to bounds for fully connected ReLU networks [4].…”
Section: Related Work (mentioning)
confidence: 99%