2021
DOI: 10.1002/mp.15312

Vision Transformer‐based recognition of diabetic retinopathy grade

Abstract: Background: In the domain of natural language processing, Transformers are recognized as state-of-the-art models which, in contrast to typical convolutional neural networks (CNNs), do not rely on convolution layers. Instead, Transformers employ multi-head attention mechanisms as the main building block to capture long-range contextual relations between image pixels. Recently, CNNs have dominated the deep learning solutions for diabetic retinopathy grade recognition. However, spurred by the advantages of Transformers, …
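The multi-head attention mechanism the abstract refers to can be illustrated with a minimal sketch below. This is an assumed PyTorch illustration, not the paper's implementation; the patch count, embedding dimension, and head count are placeholders chosen for readability.

```python
import torch
import torch.nn as nn

class PatchAttention(nn.Module):
    """Minimal multi-head self-attention over flattened image patches.

    Illustrative only: dimensions are assumptions, not the paper's settings.
    """
    def __init__(self, embed_dim=64, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, patches):
        # patches: (batch, num_patches, embed_dim)
        # Every patch attends to every other patch, so long-range relations
        # between distant image regions are captured in a single layer,
        # unlike the fixed local receptive field of a convolution.
        attended, _ = self.attn(patches, patches, patches)
        return self.norm(patches + attended)

# Example: a 224x224 fundus image split into 16x16 patches -> 196 tokens
x = torch.randn(1, 196, 64)
print(PatchAttention()(x).shape)  # torch.Size([1, 196, 64])
```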


Cited by 63 publications (26 citation statements) · References 50 publications
“…Wu et al. (26) successfully employed Vision Transformer (27) to recognize diabetic retinopathy grade with more accuracy than the CNN-based model. Yang et al.…”
Section: Introduction (mentioning)
confidence: 99%
“…AS-OCT is, on the strength of its technical advancement, becoming an increasingly potent imaging modality for evaluation of various ocular diseases in the field of ophthalmology. In addition, the ViT, which is a recent extension of the Transformer to computer vision inspired by its success in natural language processing, has attained excellent results in image classification and has shown its usefulness in ophthalmology as well [40, 41]. Indeed, our pilot study demonstrated a promising potential of ViT for prediction of age from AS-OCT images.…”
Section: Discussion (mentioning)
confidence: 99%
“…Similarly, in diabetic retinopathy classification, Sun et al [101] proposed a lesion-aware Transformer architecture that jointly learns to detect the presence of diabetic retinopathy and the location of lesion discovery, using an encoder-decoder structure. Several authors have addressed the task of diabetic retinopathy recognition in a multi-class setting (i.e., no diabetic retinopathy, mild non-proliferative diabetic retinopathy, moderate diabetic retinopathy, severe non-proliferative diabetic retinopathy, and proliferative diabetic retinopathy) with a Vision Transformer architecture [102,103]. Yang et al [104] used a hybrid CNN-Transformer architecture to tackle ophthalmic image data in a multi-class setting (i.e., normal, diabetes, glaucoma, cataract, age-related macular degeneration, hypertension, myopia, and other abnormalities) using different data pre-processing strategies.…”
Section: Retinal Disease Classification (mentioning)
confidence: 99%
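The multi-class setting described in the last citation statement (five standard diabetic retinopathy grades predicted by a Vision Transformer) can be sketched as below. This is an assumed setup using a torchvision ViT-B/16 backbone, not the exact configuration of the cited works [102, 103]; the weight initialization and head replacement are illustrative.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

# Assumed configuration: a ViT-B/16 backbone whose classification head is
# replaced by a 5-way layer for the standard diabetic retinopathy grades
# (no DR, mild, moderate, severe non-proliferative, proliferative).
model = vit_b_16(weights=None)  # pretrained weights would typically be used
model.heads.head = nn.Linear(model.heads.head.in_features, 5)

fundus_batch = torch.randn(2, 3, 224, 224)  # dummy 224x224 fundus images
logits = model(fundus_batch)                # shape: (2, 5)
grade = logits.argmax(dim=1)                # predicted DR grade per image
print(logits.shape, grade)
```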