Class-imbalanced datasets are common across different domains such as health, banking, security and others. With such datasets, the learning algorithms are often biased toward the majority class-instances. Data Augmentation is a common approach that aims at rebalancing a dataset by injecting more data samples of the minority class instances. In this paper, a new data augmentation approach is proposed using a Generative Adversarial Networks (GAN) to handle the class imbalance problem. Unlike common GAN models, which use a single fake class, the proposed method uses multiple fake classes to ensure a fine-grained generation and classification of the minority class instances. Moreover, the proposed GAN model is conditioned to generate minority class instances aiming at rebalancing the dataset. Extensive experiments were carried out using public datasets, where synthetic samples generated using our model were added to the imbalanced dataset, followed by performing classification using Convolutional Neural Network. Experiment results show that our model can generate diverse minority class instances, even in extreme cases where the number of minority class instances is relatively low. Additionally, superior performance of our model over other common augmentation and oversampling methods was achieved in terms of classification accuracy and quality of the generated samples.
Engineering drawings are commonly used in different industries such as Oil and Gas, construction, and other types of engineering. Digitising these drawings is becoming increasingly important. This is mainly due to the need to improve business practices such as inventory, assets management, risk analysis, and other types of applications. However, processing and analysing these drawings is a challenging task. A typical diagram often contains a large number of different types of symbols belonging to various classes and with very little variation among them. Another key challenge is the class-imbalance problem, where some types of symbols largely dominate the data while others are hardly represented in the dataset. In this paper, we propose methods to handle these two challenges. First, we propose an advanced bounding-box detection method for localising and recognising symbols in engineering diagrams. Our method is end-to-end with no user interaction. Thorough experiments on a large collection of diagrams from an industrial partner proved that our methods accurately recognise more than 94% of the symbols. Secondly, we present a method based on Deep Generative Adversarial Neural Network for handling class-imbalance.The proposed GAN model proved to be capable of learning from a small number of training examples. Experiment results showed that the proposed method greatly improved the classification of symbols in engineering drawings.
Fine-grained image classification with a few-shot classifier is a highly challenging open problem at the core of a numerous data labeling applications. In this paper, we present Few-shot Classifier Generative Adversarial Network as an approach for few-shot classification. We address the problem of few-shot classification by designing a GAN in which the discriminator and the generator compete to output labeled data in any case. In contrast to previous methods, our techniques generate then classify images into multiple fake or real classes. A key innovation of our adversarial approach is to allow fine-grained classification using multiple fake classes with semi-supervised deep learning. A major strength of our techniques lies in its label-agnostic characteristic, in the sense that the system handles both labeled and unlabeled data during training. We validate quantitatively our few-shot classifier on the MNIST and SVHN datasets by varying the ratio of labeled data over unlabeled data in the training set. Our quantitative analysis demonstrates that our techniques produce better classification performance when using multiple fake classes and larger amount of unlabelled data.
Class-imbalanced datasets often contain one or more class that are under-represented in a dataset. In such a situation, learning algorithms are often biased toward the majority class instances. Therefore, some modification to the learning algorithm or the data itself is required before attempting a classification task. Data augmentation is one common approach used to improve the presence of the minority class instances and rebalance the dataset. However, simple augmentation techniques such as applying some affine transformation to the data, may not be sufficient in extreme cases, and often do not capture the variance present in the dataset. In this paper, we propose a new approach to generate more samples from minority class instances based on Generative Adversarial Neural Networks (GAN). We introduce a new Multiple Fake Class Generative Adversarial Networks (MFC-GAN) and generate additional samples to rebalance the dataset. We show that by introducing multiple fake class and oversampling, the model can generate the required minority samples. We evaluate our model on face generation task from attributes using a reduced number of samples in the minority class. Results obtained showed that MFC-GAN produces plausible minority samples that improve the classification performance compared with state-of-the-art AC-GAN generated samples.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.