2021
DOI: 10.48550/arxiv.2112.13890
Preprint

SPViT: Enabling Faster Vision Transformers via Soft Token Pruning

Cited by 5 publications (7 citation statements). References 53 publications (62 reference statements).
“…[53,15,30,69] propose different heuristics based on the attention weights to halt or aggregate tokens. [25] combines both token selection and aggregation. [79] proposes a slow-fast token update that applies token-wise transformations on the halted tokens and attention-based transformations on those that are not halted.…”
Section: Dynamic Transformer (mentioning, confidence: 99%)
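The halting and aggregation heuristics described in the statement above typically rank patch tokens by how strongly the class token attends to them and keep only the top-ranked tokens for later layers. Below is a minimal, hypothetical sketch of that idea; the function name, tensor shapes, and keep ratio are illustrative assumptions, not the exact procedure of any cited method.

```python
# Hypothetical sketch: prune patch tokens using class-token attention as importance.
import torch

def prune_by_cls_attention(tokens: torch.Tensor, attn: torch.Tensor, keep_ratio: float = 0.5):
    """tokens: (B, 1 + N, D) with the class token first; attn: (B, heads, 1 + N, 1 + N)."""
    B, L, D = tokens.shape
    n_patches = L - 1
    # Importance of each patch token = class-token attention to it, averaged over heads.
    cls_attn = attn[:, :, 0, 1:].mean(dim=1)              # (B, N)
    k = max(1, int(n_patches * keep_ratio))
    keep_idx = cls_attn.topk(k, dim=1).indices            # (B, k)
    keep_idx, _ = keep_idx.sort(dim=1)                    # preserve spatial order
    patch_tokens = tokens[:, 1:, :]
    kept = torch.gather(patch_tokens, 1,
                        keep_idx.unsqueeze(-1).expand(-1, -1, D))
    return torch.cat([tokens[:, :1, :], kept], dim=1)     # (B, 1 + k, D)

# Example: 196 patch tokens reduced to 98 before the next transformer block.
x = torch.randn(2, 197, 384)
a = torch.softmax(torch.randn(2, 6, 197, 197), dim=-1)
print(prune_by_cls_attention(x, a).shape)  # torch.Size([2, 99, 384])
```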
“…Hard pruning methods filter out some unimportant tokens according to a predefined scoring mechanism. DynamicViT [31], SPViT [21], and AdaViT [29] introduce additional prediction networks to score the tokens. Evo-ViT [45], ATS [16], and EViT [24] utilize the values of class tokens to evaluate the importance of tokens.…”
Section: Related Work (mentioning, confidence: 99%)
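The "additional prediction networks" mentioned in the statement above are lightweight modules that score each token and gate it in a differentiable way during training. The sketch below shows one such scorer under assumed layer sizes and a simple soft-mask formulation; it is not the exact design of DynamicViT, SPViT, or AdaViT.

```python
# Hypothetical sketch: a small MLP predicts a per-token keep probability.
import torch
import torch.nn as nn

class TokenScorer(nn.Module):
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.LayerNorm(dim),
                                 nn.Linear(dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, 1))

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, N, D) -> keep probability per token: (B, N)
        return torch.sigmoid(self.mlp(patch_tokens)).squeeze(-1)

scorer = TokenScorer(dim=384)
x = torch.randn(2, 196, 384)
keep_prob = scorer(x)                      # (2, 196), values in (0, 1)
soft_pruned = x * keep_prob.unsqueeze(-1)  # soft masking keeps training differentiable
print(keep_prob.shape, soft_pruned.shape)
```

Soft masking of this kind keeps the pruning decision differentiable during training; at inference the probabilities can be binarized so low-scoring tokens are actually dropped and compute is saved.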
“…ToMe [2] merges similar tokens to reduce the length of the input sequence. The other approaches [10,12] prune tokens into a single token to reduce the length of an input sequence. However, our method processes multiple inputs at the same time, naturally reducing the computational cost.…”
Section: Related Work (mentioning, confidence: 99%)
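Token merging, as attributed to ToMe in the statement above, shortens the sequence by combining similar tokens instead of discarding them. The following is a simplified, hypothetical illustration that merges the single most similar pair by averaging; ToMe itself uses a more efficient bipartite soft matching that merges many pairs per layer.

```python
# Hypothetical sketch: merge the most similar pair of tokens (cosine similarity).
import torch
import torch.nn.functional as F

def merge_most_similar_pair(tokens: torch.Tensor) -> torch.Tensor:
    """tokens: (N, D) -> (N - 1, D) after merging the closest pair."""
    norm = F.normalize(tokens, dim=-1)
    sim = norm @ norm.t()                              # (N, N) cosine similarities
    sim.fill_diagonal_(float("-inf"))                  # ignore self-similarity
    i, j = divmod(int(sim.argmax()), sim.size(1))      # indices of the closest pair
    merged = (tokens[i] + tokens[j]) / 2
    keep = [t for t in range(tokens.size(0)) if t not in (i, j)]
    return torch.cat([tokens[keep], merged.unsqueeze(0)], dim=0)

x = torch.randn(197, 384)
print(merge_most_similar_pair(x).shape)  # torch.Size([196, 384])
```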
“…These improvements, however, came at the cost of rapidly increasing computational burden, with the introduction of Transformer [4,7,20] marking a major milestone in this aspect. With the growing popularity of transformers, methods to reduce their computational costs have become a prominent research topic [1,2,10,12,17,22].…”
(mentioning, confidence: 99%)