2022
DOI: 10.48550/arxiv.2206.08898
Preprint

SimA: Simple Softmax-free Attention for Vision Transformers

Abstract: Recently, vision transformers have become very popular. However, deploying them in many applications is computationally expensive, partly due to the Softmax layer in the attention block. We introduce a simple but effective Softmax-free attention block, SimA, which normalizes the query and key matrices with a simple ℓ1-norm instead of using a Softmax layer. Then, the attention block in SimA is a simple multiplication of three matrices, so SimA can dynamically change the ordering of the computation at test time to a…
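The abstract describes replacing the Softmax over attention scores with an ℓ1 normalization of Q and K, so that attention reduces to a plain product of three matrices whose multiplication order can be chosen at test time. Below is a minimal PyTorch sketch of that idea, not the authors' reference implementation; the function name, the epsilon guard, the tensor shapes, and the choice to normalize over the token axis are assumptions made for illustration.

```python
import torch

def sima_attention(q, k, v, eps=1e-6):
    """Softmax-free attention in the spirit of SimA (illustrative sketch only)."""
    # q, k, v: (batch, tokens, dim) projections of the input sequence.
    # Replace the Softmax over attention scores with an L1 normalization of Q and K.
    # Assumption: the normalization is taken over the token axis, per channel.
    q = q / (q.abs().sum(dim=-2, keepdim=True) + eps)
    k = k / (k.abs().sum(dim=-2, keepdim=True) + eps)

    n, d = q.shape[-2], q.shape[-1]
    # With no Softmax in between, attention is a plain product of three matrices,
    # so the multiplication order can be chosen at test time.
    if n <= d:
        # (Q K^T) V: cost grows with the square of the token count.
        return (q @ k.transpose(-2, -1)) @ v
    # Q (K^T V): cost grows linearly in the token count.
    return q @ (k.transpose(-2, -1) @ v)

# Illustrative usage with made-up shapes (batch of 2, 197 tokens, 64 channels):
x = torch.randn(2, 197, 64)
print(sima_attention(x, x, x).shape)  # torch.Size([2, 197, 64])
```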

Cited by 1 publication (1 citation statement)
References 39 publications (52 reference statements)
“…Further compounding the issue, the inclusion of the exponential function in softmax poses an overhead for hardware design. An earlier method presented in [23] employs softmax-free attention, leveraging the L1 norm for Q and K, effectively removing softmax. However, the L1 norm isn't constant and mandates online computations during inference.…”
Section: F. Hardware Friendly Model Design
confidence: 99%
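To make the excerpt's point concrete: the 1/sqrt(d) scaling in standard Softmax attention is a constant fixed at design time, whereas the ℓ1 normalizer depends on the actual token values and therefore has to be recomputed for every input at inference. A small illustrative sketch (the tensor shapes and the normalization axis are assumptions):

```python
import torch

d = 64
q = torch.randn(1, 197, d)  # query tokens for one input (illustrative shape)

# Standard Softmax attention scales scores by a constant known at design time:
fixed_scale = d ** -0.5

# The L1 normalizer used in place of Softmax depends on the token values,
# so it must be recomputed online for every new input at inference time:
l1_norm = q.abs().sum(dim=-2, keepdim=True)  # changes from input to input
q_hat = q / (l1_norm + 1e-6)
```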