Can Attention Enable MLPs To Catch Up With CNNs?

2021 · Preprint
DOI: 10.48550/arxiv.2105.15078

Abstract: In the first week of May 2021, researchers from four different institutions (Google, Tsinghua University, Oxford University, and Facebook) shared their latest work [16,7,12,17] on arXiv.org almost simultaneously, each proposing a new learning architecture consisting mainly of linear layers and claiming it to be comparable, or even superior, to convolution-based models. This sparked immediate discussion and debate in both the academic and industrial communities as to whether MLPs are sufficient, many thinking tha…

Cited by 2 publications (1 citation statement) · References 17 publications
“…Several MLP-based architectures for computer vision that also operate on sequences of image patches have been recently proposed [7]. The aim of these architectures is to reduce the computational cost of ViT by removing the attention mechanism, while achieving comparable performance by preserving a global receptive field similar to that of ViT.…”
Section: Attention-free MLP-based Architectures (mentioning)
Confidence: 99%
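The citation statement above describes the core idea of these attention-free models: mix information across the sequence of image patches with plain linear layers, so every patch retains a global receptive field without attention. As a concrete illustration, below is a minimal sketch of an MLP-Mixer-style block in PyTorch; the class name, layer sizes, and patch count are illustrative assumptions, not code from any of the cited papers.

```python
# Minimal sketch of a token-mixing MLP block (MLP-Mixer style).
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, num_patches: int, dim: int,
                 token_hidden: int, channel_hidden: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Token-mixing MLP: acts across the patch dimension, giving every
        # patch a global receptive field without any attention mechanism.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden),
            nn.GELU(),
            nn.Linear(token_hidden, num_patches),
        )
        self.norm2 = nn.LayerNorm(dim)
        # Channel-mixing MLP: acts on each patch's features independently.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden),
            nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, dim)
        y = self.norm1(x).transpose(1, 2)          # (batch, dim, num_patches)
        x = x + self.token_mlp(y).transpose(1, 2)  # mix across patches
        x = x + self.channel_mlp(self.norm2(x))    # mix across channels
        return x

# Example: 196 patches (a 14x14 grid from a 224x224 image), 512 channels.
block = MixerBlock(num_patches=196, dim=512, token_hidden=256, channel_hidden=2048)
out = block(torch.randn(2, 196, 512))              # -> (2, 196, 512)
```

Compared with self-attention, whose cost grows quadratically with the number of patches through the attention matrix, the token-mixing MLP here is a fixed learned linear map over patches, which is the computational saving the cited statement refers to.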