A Deep CNN Transformer Hybrid Model for Skin Lesion Classification of Dermoscopic Images Using Focal Loss

Nie, Yali; Sommella, Paolo; Carratù, Marco; O’Nils, Mattias; Lundgren, Jan

doi:10.3390/diagnostics13010072

Cited by 14 publications

(5 citation statements)

References 50 publications


“…The CNN model cannot extract features with dryness, brightness, and shining images. The deep CNN with transformer model [38] uses all parts of the input image by dividing the input image into tokens and applying the transformers directly to the sequence of input images. The deep CNN with transformer model does not highlight the localized information, is unable to capture the contextual information, and is applicable for large datasets.…”

Section: Results (mentioning)

confidence: 99%

Type 1 and Type 2 Diabetes Measurement Using Human Face Skin Region

Euprazia, Rajeswari, Thyagharajan et al. 2023

Journal of Diabetes Research


Aim. Analyse the diabetes mellitus (DM) of a person through the facial skin region using vision diabetology. Diabetes mellitus is caused by persistent high blood glucose levels and related complications, which show variation in facial skin regions due to reduced blood flow in the facial arteries. Materials and Method. In this study, 200 facial images of diabetes patients with skin conditions such as Bell’s palsy, rubeosis faciei, scleroderma, and vitiligo were collected from existing face videos. Moreover, face images are collected from diabetic persons in India. Viola Jones’ face-detecting algorithm extracts face skin regions from a diabetic person’s face image in video frames. The affected skin area on the diabetic person’s face is detected using HSV colour model segmentation. The proposed multiwavelet transform convolutional neural network (MWTCNN) extracts the features for diabetic measurement from up- and downfacial scaled images of diabetic persons. Results. The existing deep learning models are compared with the proposed MWTCNN model, which provides the highest accuracy of 98.3%. Conclusion. The facial skin region-based diabetic measurement avoids pricking of the serum and is used for continuous glucose monitoring.


“…Due to its ability to identify complex patterns in medical imaging data, DL has emerged as a promising tool for medical diagnosis, especially in the classification of skin diseases [10]. Utilizing various network architectures, such as Convolutional Neural Networks (CNN) [11], [12], an increasing body of research has been dedicated to applying DL techniques to classify skin diseases [13].…”

Section: Literature Review (mentioning)

confidence: 99%

An Integrated Multimodal Deep Learning Framework for Accurate Skin Disease Classification

Hamida, Lamrani, Bouqentar et al. 2024

Int. J. Onl. Eng.


In order to effectively treat skin diseases, an accurate and prompt diagnosis is required. In this article, a novel method for classifying skin disorders using a multimodal classifier is presented. The proposed classifier utilizes multiple information sources to enhance the accuracy of disease classification. It incorporates images of skin lesions and patient-specific data. The multimodal classifier simultaneously classifies diseases by combining image and structured data inputs. The effectiveness of the proposed classifier was evaluated using the ISIC 2018 dataset, which includes images and clinical data for seven categories of skin diseases. The results indicate that the proposed model outperforms conventional single-modal and single-task classifiers, achieving an accuracy of 98.66% for image classification and 94.40% for clinical data classification. In addition, we compare the performance of the proposed model with that of other methodologies, demonstrating its superiority. Despite yielding promising results, the proposed method has limitations in terms of data requirements and generalizability. Future research directions include incorporating additional information sources, investigating genetic data integration, and applying the method to various medical conditions. This study illustrates the potential of integrating multimodal techniques with transfer learning in deep neural networks to enhance the classification accuracy of cutaneous diseases.


“…However, CNN-based methods generally show limitations in modeling long-distance feature dependencies due to the local convolution operations and small receptive fields. To overcome this limitation, Vision Transformers have been developed and shown to improve feature extraction by building long-range feature interactions and capturing the global context of the features [6,7]. Particularly, Swin Transformer (SwinT) constructs a hierarchical representation of features by starting from small-sized patches and gradually merging neighboring patches in deeper Transformer layers [8].…”

Section: Introduction (mentioning)

confidence: 99%

A hybrid CNN-Swin Transformer network as deep learning model observer to predict human observer performance in 2AFC trial

Shao, Mitra, Byrd et al. 2024

Medical Imaging 2024: Image Perception, Observer Performance, and Technology Assessment


Model observers designed to predict human observers in detection tasks are important tools for assessing task-based image quality and optimizing imaging systems, protocol, and reconstruction algorithms. Linear model observers have been widely studied to predict human detection performance, and recently, deep learning model observers (DLMOs) have been developed to improve the prediction accuracy. Most existing DLMOs utilize convolutional neural network (CNN) architectures, which are capable of learning local features while not good at extracting long-distance relations in images. To further improve the performance of CNN-based DLMOs, we investigate a hybrid CNN-Swin Transformer (CNN-SwinT) network as DLMO for PET lesion detection. The hybrid network combines CNN and SwinT encoders, which can capture both local information and global context. We trained the hybrid network on the responses of 8 human observers including 4 radiologists in a two-alternative forced choice (2AFC) experiment with PET images generated by adding simulated lesions to clinical data. We conducted a 9-fold cross-validation experiment to evaluate the proposed hybrid DLMO, compared to conventional linear model observers such as a channelized Hotelling observer (CHO) and a non-prewhitening matched filter (NPWMF). The hybrid CNN-SwinT DLMO predicted human observer responses more accurately than the linear model observers and DLMO with only the CNN encoder. This work demonstrates that the proposed hybrid CNN-SwinT DLMO has the potential as an improved tool for task-based image quality assessment.

