2020
DOI: 10.1007/978-3-030-58526-6_41
Feature Space Augmentation for Long-Tailed Data

Abstract: Real-world data often follow a long-tailed distribution, as the frequency of each class is typically different. For example, a dataset can have a large number of under-represented classes and a few classes with more than sufficient data. However, a model trained on the dataset is usually expected to have reasonably homogeneous performance across classes. Introducing class-balanced loss and advanced methods of data re-sampling and augmentation are among the best practices to alleviate the data imbalance problem…

Cited by 159 publications (126 citation statements); References 42 publications.
“…3) According to equations (10) and (11), calculate the basic probability value of the network output, and use statistical analysis to obtain the basic probability values of the other evidence. 4) Use equations (5)-(8) to calculate the confidence function, likelihood function, and conflict factor K of each focal element.…”
Section: )mentioning
confidence: 99%
“…For the same reason, Figure 2 shows that the AUC area of the three models XGBoost, RandomForest, and BPNN after arctangent processing increased by 0.0217 on average, and AUC-area stratification occurred: the AUC area of XGBoost is larger than that of RandomForest, and the AUC area of RandomForest is larger than that of BPNN. This shows that the arctangent transformation maps the input data to [0, π/2), after which normalization can effectively solve the long-tailed [11] distribution problem of DGA data, so that the model can better express the nonlinear relationship between input and output.…”
Section: )mentioning
confidence: 99%
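The arctangent-plus-normalization preprocessing described in the excerpt above can be sketched as follows; the function name and sample data are illustrative, not taken from the cited work:

```python
import numpy as np

def arctan_normalize(x):
    """Compress a long-tailed, non-negative feature with arctangent
    (mapping values into [0, pi/2)), then min-max normalize to [0, 1]."""
    y = np.arctan(np.asarray(x, dtype=float))
    return (y - y.min()) / (y.max() - y.min())

# Long-tailed sample: a few extreme values dominate the raw scale.
raw = np.array([0.0, 0.5, 1.0, 10.0, 1000.0])
scaled = arctan_normalize(raw)
```

The arctangent compresses the extreme tail values while preserving order, so subsequent min-max normalization is no longer dominated by a handful of outliers.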
“…One common method is to generate Gaussian-distribution-based noise directly or iteratively [5] and combine the noise with data or features through linear combination [6,17]. Statistic-based methods use the statistical distribution of the dataset to generate more controllable perturbations [2]. Gradient-based perturbation generation is another approach, based on the gradient of the model's prediction loss, usually combined with gradient ascent on confused classes [8], an adjusting method [13], or attack methods such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) [19,21].…”
Section: Related Work 21 Adversarial Training In Image Classificationmentioning
confidence: 99%
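A minimal sketch of the FGSM-style gradient perturbation mentioned in the excerpt above, assuming the loss gradient with respect to the input is already available (the function name and epsilon value are illustrative):

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon=0.03):
    """FGSM step: move the input in the direction of the sign of the
    loss gradient, with step size epsilon."""
    return x + epsilon * np.sign(grad)

x = np.zeros(3)                    # clean input
grad = np.array([1.0, -2.0, 0.5])  # d(loss)/d(x), e.g. from backprop
x_adv = fgsm_perturb(x, grad, epsilon=0.1)  # → [0.1, -0.1, 0.1]
```

The sign function makes the perturbation uniform in magnitude across dimensions, which is what distinguishes FGSM from a plain gradient-ascent step; PGD iterates this step with projection back into an epsilon-ball.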
“…The main difference between existing adversarial training methods lies in how they generate and add perturbations. Commonly used perturbation generation methods include stochastic normal distributions [17], statistical information [2], gradients [5,8,13,19,21], and Generative Adversarial Network (GAN) based methods [14]; methods to introduce perturbations include data perturbation [5,8,13,14,19,21] and feature perturbation [2,17]. It is worth mentioning that existing methods usually make a trade-off between model performance and robustness.…”
Section: Introductionmentioning
confidence: 99%