Multi-Label Fundus Image Classification Using Attention Mechanisms and Feature Fusion

Li, Zhenwei; Xu, Min; Yang, Xiaoli; Han, Yanqi

doi:10.3390/mi13060947

Cited by 15 publications

(3 citation statements)

References 29 publications

“…In extreme cases, if the identity mapping is optimal, the network can use an easier way to construct the identity mapping, pushing the residuals F(x) = H(x) − x to zero, which is easier than fitting the identity mapping with multiple non-linear layers. Therefore, the gradient flows directly through these connections, reducing the disappearance or explosion of the gradient, and the training of the deep learning network becomes easier [36, 37, 38].…”

Section: Construction of the Residual Network Model (mentioning)

confidence: 99%

Hardness-and-Type Recognition of Different Objects Based on a Novel Porous Graphene Flexible Tactile Sensor Array

Yang, Wang, et al. 2023. Micromachines.

Accurately recognizing the hardness and type of different objects by tactile sensors is of great significance in human–machine interaction. In this paper, a novel porous graphene flexible tactile sensor array with great performance is designed and fabricated, and it is mounted on a two-finger mechanical actuator. This is used to detect various tactile sequence features from different objects by slightly squeezing them by 2 mm. A Residual Network (ResNet) model, with excellent adaptivity and feature extraction ability, is constructed to realize the recognition of 4 hardness categories and 12 object types, based on the tactile time sequence signals collected by the novel sensor array; the average accuracies of hardness and type recognition are 100% and 99.7%, respectively. To further verify the classification ability of the ResNet model for the tactile feature information detected by the sensor array, the Multilayer Perceptron (MLP), LeNet, Multi-Channel Deep Convolutional Neural Network (MCDCNN), and ENCODER models are built based on the same dataset used for the ResNet model. The average recognition accuracies of the 4 hardness categories, based on those four models, are 93.6%, 98.3%, 93.3%, and 98.1%. Meanwhile, the average recognition accuracies of the 12 object types, based on the four models, are 94.7%, 98.9%, 85.0%, and 96.4%. All of the results demonstrate that the novel porous graphene tactile sensor array has excellent perceptual performance and the ResNet model can very effectively and precisely complete the hardness and type recognition of objects for the flexible tactile sensor array.
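The residual formulation quoted in this citation statement, F(x) = H(x) − x, can be illustrated with a minimal sketch. This is not the cited paper's network: the two-layer form of F, the helper names (`matvec`, `relu`, `residual_block`), and the weight matrices are assumptions chosen for illustration, and the activation that ResNet applies after the addition is omitted for brevity.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Return H(x) = F(x) + x, where F(x) = W2 * relu(W1 * x).

    The skip connection adds the input back, so the layers only
    have to learn the residual F(x) = H(x) - x.
    """
    f = matvec(W2, relu(matvec(W1, x)))     # residual branch F(x)
    return [fi + xi for fi, xi in zip(f, x)]  # skip connection


# With all weights at zero, F(x) = 0 and the block is the identity.
zeros = [[0.0] * 3 for _ in range(3)]
x = [1.0, -2.0, 3.0]
print(residual_block(x, zeros, zeros))
```

When every weight is zero the block returns its input unchanged, which is the sense in which driving the residual to zero is an easy route to an identity mapping.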

“…Although the existing methods have achieved good results in extracting fundus lesion features [27], the data volume still affects the classification performance of the network, and the classification effect of the network cannot be visually analyzed. Different from the above methods, this paper proposes a data enhancement method guided by Grad-CAM visual attention based on the integrated neural network, which amplifies the fundus image dataset in a targeted manner, helps the model learn rich subtle features, and improves recognition accuracy.…”

Section: Related Work (mentioning)

confidence: 99%

A Multi-Label Detection Deep Learning Model with Attention-Guided Image Enhancement for Retinal Images

Yang et al. 2023. Micromachines.

Self Cite

At present, multi-disease fundus image classification tasks still have the problems of small data volumes, uneven distributions, and low classification accuracy. In order to solve the problem of large data demand of deep learning models, a multi-disease fundus image classification ensemble model based on gradient-weighted class activation mapping (Grad-CAM) is proposed. The model uses VGG19 and ResNet50 as the classification networks. Grad-CAM is a data augmentation module used to obtain a network convolutional layer output activation map. Both the augmented and the original data are used as the input of the model to achieve the classification goal. The data augmentation module can guide the model to learn the feature differences of lesions in the fundus and enhance the robustness of the classification model. Model fine tuning and transfer learning are used to improve the accuracy of multiple classifiers. The proposed method is based on the RFMiD (Retinal Fundus Multi-Disease Image Dataset) dataset, and an ablation experiment was performed. Compared with other methods, the accuracy, precision, and recall of this model are 97%, 92%, and 81%, respectively. The resulting activation graph shows the areas of interest for model classification, making it easier to understand the classification network.
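As a rough illustration of the activation map this abstract refers to, the sketch below computes a standard Grad-CAM heatmap from one convolutional layer's activations and the gradients of a class score with respect to them. How the cited paper turns the map into augmented training data is not stated here, so that step is left out. The function name `grad_cam` and the nested-list inputs are assumptions; in practice a deep learning framework would supply both arrays.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations, gradients: nested lists shaped [K][H][W] for K feature
    maps of one convolutional layer (gradients are d(class score)/dA).
    Returns an [H][W] heatmap scaled to the range [0, 1].
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weight alpha_k: spatial average of that channel's gradient.
    alphas = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum of the feature maps.
    cam = [[0.0] * width for _ in range(height)]
    for alpha, fmap in zip(alphas, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * fmap[i][j]

    # ReLU keeps only regions with a positive influence on the class.
    cam = [[max(0.0, v) for v in row] for row in cam]

    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

In the toy input, the second channel has a negative average gradient, so the location it activates is suppressed by the ReLU and only the first channel's location remains highlighted.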

“…We demonstrated in this paper that the right choice of filter significantly impacts the accuracy of the training model. Table 4 compares the accuracy obtained from our method with the results obtained in [17], [28], [29]. These papers investigate the multiclassification problem using the same ODIR dataset we chose for our experiments.…”

Section: Recall = TP / (TP + FN) (mentioning)

confidence: 99%

Classification of Ocular Diseases Related to Diabetes Using Transfer Learning

Sbai, Oukhouya, Touil 2023. Int. J. Onl. Eng.

Although artificial intelligence enables the detection of abnormalities in medical images and is widely used as a computer vision technology, many researchers have focused on the detection of only one disease related to diabetes, which is diabetic retinopathy. In fact, patients face a significant risk of two other illnesses: cataract and glaucoma. In this article, we examined the diagnosis of these three eye diseases caused by diabetes and compared four approaches to classify these conditions. The proposed approaches are based on the transfer learning technique. We started by filtering, preparing, and augmenting the dataset, then applied transfer learning for feature extraction using two different architectures: VGG16 and RESNET50. We also investigated the impact of using contrast-limited adaptive histogram equalization on the accuracy and precision of the models. This filter was used in a pre-training step for diabetic retinopathy diagnosis and in this paper proved its efficiency for glaucoma and cataract too. The final layers were replaced by Random Forest for classification. The models achieved acceptable accuracies of 89.17% and 85.64% without contrast-limited adaptive histogram equalization and better results with it, reaching accuracies of 97.48% and 96.66% for VGG16 and RESNET50, respectively.
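The contrast-limiting idea behind the filter named in this abstract can be sketched on a single image tile. This is a simplification and not full contrast-limited adaptive histogram equalization: the adaptive method applies the same step to each tile of a grid and blends neighboring tiles, which is omitted here. The function name `clipped_equalize` and the `clip_limit` value are hypothetical, and nothing below reflects the cited paper's actual preprocessing settings.

```python
def clipped_equalize(tile, clip_limit, levels=256):
    """Histogram-equalize one tile with a clipped histogram.

    tile: nested list [H][W] of integer gray levels in [0, levels).
    clip_limit: maximum count allowed per histogram bin; the excess
    is spread evenly over all bins, which limits contrast gain.
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for row in tile:
        for v in row:
            hist[v] += 1

    # Clip each bin and redistribute the clipped excess uniformly.
    excess = sum(max(h - clip_limit, 0) for h in hist)
    bonus = excess / levels
    hist = [min(h, clip_limit) + bonus for h in hist]

    # The cumulative distribution becomes the gray-level mapping.
    total = sum(hist)
    cdf = []
    running = 0.0
    for h in hist:
        running += h
        cdf.append(running / total)

    return [[round(cdf[v] * (levels - 1)) for v in row] for row in tile]


# A low-contrast tile: four nearly identical gray levels.
tile = [[100, 101], [102, 103]]
print(clipped_equalize(tile, clip_limit=1))
```

The output keeps the ordering of the input gray levels but spreads them over a much wider range, which is the contrast enhancement the abstract credits for the accuracy gain.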
