Askhat Sanbaev scite author profile

Askhat Sanbaev

3 Publications

1 Citation Statement Received

0 Citation Statements Given

Publications

Sensitivity of Modern Deep Learning Neural Networks to Unbalanced Datasets in Multiclass Classification Problems

et al. 2023

One of the critical problems in multiclass classification tasks is dataset imbalance. This is especially true when using contemporary pre-trained neural networks, where the last layers of the network are retrained. Large datasets with highly unbalanced classes are therefore poorly suited to model training, since using such a dataset leads to overfitting and, accordingly, poor metrics on test and validation datasets. In this paper, the sensitivity to dataset imbalance of Xception, ViT-384, ViT-224, VGG19, ResNet34, ResNet50, ResNet101, Inception_v3, DenseNet201, DenseNet161, and DeiT was studied using a highly imbalanced dataset of 20,971 images sorted into 7 classes. The best metrics were obtained when using a cropped dataset with augmentation of missing images in classes up to 15% of the initial number. This can increase the metrics by 2–6% compared to the models' metrics on the initial unbalanced dataset. Moreover, the classification metrics for rare classes also improved significantly: the True Positive value can increase by 0.3 or more. As a result, the best approach to training the considered networks on an initially unbalanced dataset was formulated.
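The crop-then-augment idea in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so the reading used here is an assumption: classes above a chosen size are cropped to that size, and smaller classes are topped up with augmented copies, limited to 15% of their original count. The function name `balancing_plan`, the `cap` parameter, and the toy class sizes are all hypothetical and do not come from the paper.

```python
from collections import Counter


def balancing_plan(labels, cap, max_aug_percent=15):
    """Return {class: (n_keep, n_augment)} for a crop-then-augment scheme.

    Assumed interpretation (not confirmed by the abstract):
    - classes with more than `cap` samples are cropped to `cap`;
    - smaller classes receive augmented copies toward `cap`, but never
      more than `max_aug_percent` percent of their original size.
    """
    counts = Counter(labels)
    plan = {}
    for cls, n in sorted(counts.items()):
        n_keep = min(n, cap)                     # crop majority classes
        budget = n * max_aug_percent // 100      # augmentation allowance
        n_augment = min(max(cap - n, 0), budget)
        plan[cls] = (n_keep, n_augment)
    return plan


# Toy label list with three hypothetical classes of very different sizes.
labels = ["a"] * 1000 + ["b"] * 400 + ["c"] * 100
plan = balancing_plan(labels, cap=420)
for cls, (n_keep, n_augment) in plan.items():
    print(cls, n_keep, n_augment)
```

The plan only counts samples; which images to drop and which augmentations to apply are left open, since the abstract gives no detail on either.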

Deep Learning Approaches to Automatic Chronic Venous Disease Classification

Barulina, Sanbaev, Okunkov et al. 2022

Mathematics

Chronic venous disease (CVD) occurs in a substantial proportion of the world’s population. Although the onset of CVD may look like a cosmetic defect, over time it can develop into serious problems that require surgical intervention. The aim of this work is to use deep learning (DL) methods to automatically classify the stage of CVD from an image of the patient’s legs, for the purpose of self-diagnosis. The images of legs with CVD required for the DL algorithms were collected from open Internet resources using the developed algorithms. For image preprocessing, the binary classification problem “legs–no legs” was solved based on ResNet50 with an accuracy of 0.998. Applying this filter made it possible to collect a dataset of 11,118 good-quality leg images with various stages of CVD. To classify the stages of CVD according to the CEAP classification, a multiclass classification problem was posed and solved using four neural networks with completely different architectures: ResNet50 and transformers such as data-efficient image transformers (DeiT) and a custom vision transformer (vit-base-patch16-224 and vit-base-patch16-384). The model based on DeiT without any tuning showed better results than the model based on ResNet50 (precision = 0.770 for DeiT and 0.615 for ResNet50). vit-base-patch16-384 showed the best results (precision = 0.79). To demonstrate the results of the work, a Telegram bot was developed in which fully functioning DL algorithms were implemented. This bot makes it possible to evaluate the condition of the patient’s legs with fairly good accuracy of CVD classification.
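The models in this abstract are compared by precision. As a reminder of how that metric is computed in a multiclass setting, here is a minimal sketch in plain Python. The abstract does not say which averaging was used, so macro-averaging is an assumption here, and the labels `C1` to `C3` are placeholder class names with toy predictions, not data from the paper.

```python
def per_class_precision(y_true, y_pred):
    """Precision per class: correct predictions of c / all predictions of c."""
    classes = sorted(set(y_true) | set(y_pred))
    result = {}
    for c in classes:
        predicted_c = sum(1 for p in y_pred if p == c)
        correct_c = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # A class that is never predicted gets precision 0.0 by convention here.
        result[c] = correct_c / predicted_c if predicted_c else 0.0
    return result


def macro_precision(y_true, y_pred):
    """Unweighted mean of the per-class precisions (assumed averaging)."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy example with placeholder stage labels.
y_true = ["C1", "C1", "C2", "C2", "C3", "C3"]
y_pred = ["C1", "C2", "C2", "C2", "C3", "C1"]
scores = per_class_precision(y_true, y_pred)
macro = macro_precision(y_true, y_pred)
print(scores, round(macro, 3))
```

Macro-averaging weights every class equally, which makes errors on rare classes visible; a support-weighted average would instead be dominated by the frequent classes.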

Sensitivity of Modern Deep Learning Neural Networks to Unbalanced Datasets in Multiclass Classification Problems

Barulina, Okunkov, Ulitin et al. 2023

Preprint

One of the critical problems in multiclass classification tasks is dataset imbalance. This is especially true when using contemporary pre-trained neural networks, where, in fact, the last layers of the network are retrained. Large datasets with highly unbalanced classes are therefore poorly suited to model training, since using such a dataset leads to overfitting and, accordingly, poor metrics on test and validation datasets. In this paper, the sensitivity to dataset imbalance of Xception, ViT-384, ViT-224, VGG19, ResNet34, ResNet50, ResNet101, Inception_v3, DenseNet201, DenseNet161, and DeiT was studied using a highly imbalanced dataset of 20,971 images sorted into 7 classes. The best metrics were obtained when using a cropped dataset with augmentation of missing images in classes up to 15% of the initial number. This can increase the metrics by 2–6% compared to the models' metrics on the initial unbalanced dataset. Moreover, the classification metrics for rare classes also improved significantly: the True Positive value can increase by 0.3 or more. As a result, the best approach to training the considered networks on an initially unbalanced dataset was formulated.
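The "True Positive value" mentioned for rare classes is read here as the per-class true positive rate, that is, the diagonal of a row-normalized confusion matrix. That reading is an assumption, since the abstract does not define the term. The sketch below computes it in plain Python on toy labels (`common`, `rare`) that are purely illustrative and are not results from the paper.

```python
from collections import Counter


def true_positive_rates(y_true, y_pred):
    """Per-class true positive rate: correct predictions of c / true samples of c."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in sorted(support)}


# Toy data: 8 samples of a frequent class, 2 of a rare class.
y_true = ["common"] * 8 + ["rare"] * 2

# Hypothetical predictions from a model trained on unbalanced data:
# the rare class is never predicted.
pred_before = ["common"] * 10

# Hypothetical predictions after rebalancing: one rare sample is recovered
# at the cost of one mistake on the frequent class.
pred_after = ["common"] * 7 + ["rare"] + ["rare", "common"]

before = true_positive_rates(y_true, pred_before)
after = true_positive_rates(y_true, pred_after)
rare_gain = after["rare"] - before["rare"]
print(before, after, rare_gain)
```

Looking at this rate per class, rather than at overall accuracy, is what exposes the effect on rare classes: in the toy data overall accuracy is 0.8 in both cases, while the rate for the rare class changes.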
