2022
DOI: 10.1007/s10489-022-03793-w

Hostility measure for multi-level study of data complexity

Abstract: Complexity measures aim to characterize the underlying complexity of supervised data. These measures tackle factors hindering the performance of Machine Learning (ML) classifiers like overlap, density, linearity, etc. The state-of-the-art has mainly focused on the dataset perspective of complexity, i.e., offering an estimation of the complexity of the whole dataset. Recently, the instance perspective has also been addressed. In this paper, the hostility measure, a complexity measure offering a multi-level (ins…

Cited by 7 publications (4 citation statements)
References 36 publications

“…In addition, some of the original complexity measures have been adapted to the instance level [3]. Recently, a multi-level analysis of data complexity has been addressed, covering the instance, the class, and the dataset level with a new proposed complexity measure called hostility measure [17].…”
Section: State-of-the-art
Citation type: mentioning
confidence: 99%
“…However, the instance perspective of data complexity has fostered their use in tasks related to IS like noise filter or data sampling. For example, in [17], they filter a 10% and a 50% of the most complex points, reducing the error. In [34], data complexity is employed for curriculum learning.…”
Section: State-of-the-art
Citation type: mentioning
confidence: 99%
“…Hance, improving the dataset characteristics is crucial to enhance the performance of classification. These characteristics can include overlapping classes, linearity of bound decisions, and imbalance ratio in the dataset (Lancho et al 2023). Ho and Basu (2002) introduced a measurement to assess the dataset characteristics by examining the geometrical distribution of data.…”
Section: Introduction
Citation type: mentioning
confidence: 99%