2022
DOI: 10.48550/arxiv.2201.02973
Preprint

MAXIM: Multi-Axis MLP for Image Processing

Abstract: Recent progress on Transformers and multi-layer perceptron (MLP) models provide new network architectural designs for computer vision tasks. Although these models proved to be effective in many vision tasks such as image recognition, there remain challenges in adapting them for low-level vision. The inflexibility to support high-resolution images and limitations of local attention are perhaps the main bottlenecks for using Transformers and MLPs in image restoration. In this work we present a multi-axis MLP bas…

Cited by 8 publications (9 citation statements)
References 72 publications

“…Tu et al. [130] propose MAXIM, a UNet-shaped hierarchical structure that supports long-range interactions enabled by spatially gated MLPs. MAXIM contains two MLP-based building blocks: a multi-axis-gated MLP and a cross-gating block, both are variants of the gMLP block.…”
Section: Applications of MLP Variants (mentioning)
confidence: 99%
“…Sequential vs. parallel. In our approach, we sequentially stack the multi-axis attention modules following [54,84], while there also exist other models that adopt a parallel design [81,98].…”
Section: Ablation Studies (mentioning)
confidence: 99%
“…It also achieves promising results in restoration tasks [11,38,80,43,4,37,18,20,5,89,46,72]. In particular, for video restoration, Cao et al [4] propose the first transformer model for video SR, while Liang et al [37] propose an unified framework for video SR, deblurring and denoising.…”
Section: Vision Transformer (mentioning)
confidence: 99%