2022
DOI: 10.3389/fmed.2021.821120

MEDUSA: Multi-Scale Encoder-Decoder Self-Attention Deep Neural Network Architecture for Medical Image Analysis

Abstract: Medical image analysis continues to hold interesting challenges given the subtle characteristics of certain diseases and the significant overlap in appearance between diseases. In this study, we explore the concept of self-attention for tackling such subtleties in and between diseases. To this end, we introduce a multi-scale encoder-decoder self-attention (MEDUSA) mechanism tailored for medical image analysis. While self-attention deep convolutional neural network architectures in existing literature center a…

Cited by 12 publications (8 citation statements)
References 36 publications
“…This module accepts the following four different feature sizes output by the backbone network: ( , , 256), ( , , 512), ( , , 1024), and ( , , 2048). These features are weighted toward the target area through the multi-head self-attention mechanism of a traditional transformer, thereby focusing on the target feature information [34,35]. Each layer corresponds to one of the four output feature sizes: ( , , 256), ( , , 512), ( , , 1024), and ( , , 2048).…”
Section: BFA
confidence: 99%
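The per-scale weighting this citing work describes can be illustrated with a short sketch. Below is a minimal PyTorch example, not the cited paper's actual code: standard multi-head self-attention is applied independently at each backbone scale, treating spatial positions as tokens. The ResNet-50-style shapes in the usage lines are assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleSelfAttention(nn.Module):
    """Sketch: apply multi-head self-attention to each backbone feature map
    independently, treating each spatial position as a token."""
    def __init__(self, channels=(256, 512, 1024, 2048), num_heads=8):
        super().__init__()
        self.attn = nn.ModuleList(
            nn.MultiheadAttention(c, num_heads, batch_first=True) for c in channels
        )

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) tensors from the backbone
        out = []
        for f, attn in zip(feats, self.attn):
            b, c, h, w = f.shape
            tokens = f.flatten(2).transpose(1, 2)       # (B, H*W, C)
            weighted, _ = attn(tokens, tokens, tokens)  # self-attention over positions
            out.append(weighted.transpose(1, 2).reshape(b, c, h, w))
        return out

# Hypothetical usage with ResNet-50-style feature pyramid shapes:
feats = [torch.randn(1, c, s, s) for c, s in [(256, 56), (512, 28), (1024, 14), (2048, 7)]]
weighted = MultiScaleSelfAttention()(feats)
```

Treating positions as tokens makes the attention cost quadratic in H*W per scale, which is why such modules are typically applied to the downsampled backbone outputs rather than the full-resolution input.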
“…This module accepts the following four different feature sizes output by the backbone network: ( … ). These features are weighted toward the target area through the multi-head self-attention mechanism of a traditional transformer, thereby focusing on the target feature information [34,35] ( … ). These features were up-sampled and fused through the corresponding GC layer.…”
Section: BFA
confidence: 99%
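The "up-sampled and fused" step can be read as FPN-style fusion: the coarser attention-weighted map is upsampled to the finer scale and combined. The citing paper's GC-layer details are not given here, so the function name and the additive fusion below are hypothetical; a minimal sketch under those assumptions:

```python
import torch
import torch.nn.functional as F

def upsample_and_fuse(coarse, fine):
    """Sketch: bilinearly upsample the coarser feature map to the finer
    map's spatial size, then fuse by addition (assumed; channel counts
    must already match, e.g. via a 1x1 projection in practice)."""
    up = F.interpolate(coarse, size=fine.shape[-2:], mode="bilinear", align_corners=False)
    return fine + up

fine = torch.randn(1, 256, 56, 56)
coarse = torch.randn(1, 256, 28, 28)
fused = upsample_and_fuse(coarse, fine)  # (1, 256, 56, 56)
```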
“…Then, the network utilized a self-attention mechanism to model non-local interactions and learn rich contextual information to detect complex-shaped lesion regions, which improved recognition efficiency. Aboutalebi et al [35] proposed a multi-scale encoder-decoder self-attention (MEDUSA) model to address the significant overlap in image appearance between diseases. The model improved the ability to capture global long-range spatial context by introducing self-attention modules, achieving good classification performance on several datasets.…”
Section: Hybrid Network Models
confidence: 99%
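To make the encoder-decoder self-attention idea concrete, here is a minimal sketch: a small convolutional encoder-decoder produces a same-shaped attention map that gates the input features. This is an illustrative reading of the mechanism, not MEDUSA's published architecture; all layer choices below are assumptions.

```python
import torch
import torch.nn as nn

class EncoderDecoderAttention(nn.Module):
    """Sketch: a convolutional encoder-decoder emits per-position,
    per-channel attention weights in [0, 1] that gate the input features.
    Loosely in the spirit of encoder-decoder self-attention; details assumed."""
    def __init__(self, channels):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 3, stride=2, padding=1),  # downsample 2x
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels // 4, channels, 4, stride=2, padding=1),  # upsample 2x
            nn.Sigmoid(),
        )

    def forward(self, x):
        attn = self.decoder(self.encoder(x))  # same shape as x
        return x * attn                       # gate features by learned attention

x = torch.randn(1, 256, 56, 56)
y = EncoderDecoderAttention(256)(x)  # (1, 256, 56, 56)
```

The gating form x * attn is what lets the network suppress background regions and emphasize subtle disease-specific areas, the failure mode the abstract describes for visually overlapping diseases.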