Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022
DOI: 10.1145/3534678.3539260

Learned Token Pruning for Transformers

Cited by 36 publications (19 citation statements) · References 4 publications
“…Pruning methods can also differ in the way that token reduction is applied. In fixed rate pruning (Goyal et al., 2020; Rao et al., 2021; Bolya et al., 2023; Liang et al., 2022; Xu et al., 2022) a predefined number of tokens is removed per layer, while in adaptive approaches (Kim et al., 2022; Yin et al., 2021) the tokens are pruned dynamically based on the input.…”
Section: Token Pruning
confidence: 99%
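The distinction drawn in the statement above can be made concrete with a small sketch. The following PyTorch fragment is purely illustrative and is not code from any of the cited papers; the `scores` tensor, `keep_k` budget, and `threshold` are assumptions standing in for whatever importance measure and pruning budget a given method uses.

```python
import torch

def fixed_rate_prune(tokens, scores, keep_k):
    # Fixed-rate pruning: keep the keep_k highest-scoring tokens in every
    # input, so the amount of computation removed is predetermined.
    # tokens: (batch, n, dim), scores: (batch, n)
    idx = scores.topk(keep_k, dim=1).indices                   # (batch, keep_k)
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
    return torch.gather(tokens, 1, idx)                        # (batch, keep_k, dim)

def adaptive_prune(tokens, scores, threshold):
    # Adaptive pruning: keep every token whose score exceeds a threshold,
    # so the number of surviving tokens varies with the input.
    keep = scores > threshold                                  # (batch, n) boolean mask
    return [t[m] for t, m in zip(tokens, keep)]                # ragged list per example
```

The practical difference is visible in the return types: the fixed-rate variant always yields a rectangular batch, while the adaptive variant yields per-example sequences of different lengths that need masking or padding downstream.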
“…Our learned thresholds approach is conceptually similar to learned token pruning as introduced in Kim et al. (2022). In each transformer block an importance score is calculated for every token x_i, i ∈ {1, ..., n}, where n = hw is the number of tokens.…”
Section: Learned Thresholds Pruning
confidence: 99%
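The mechanism described in this statement (a per-token importance score compared against a per-layer learned threshold, in the spirit of Kim et al. (2022)) might be sketched as follows. This is a simplified illustration under assumptions: the attention-based scoring rule, the initial threshold, and the temperature are placeholders, and the full training procedure of the cited papers is omitted.

```python
import torch

def token_importance(attn_probs):
    # attn_probs: (batch, heads, n_query, n_key) attention probabilities.
    # Score each token by the average attention it receives across heads
    # and query positions (one common choice of importance score).
    return attn_probs.mean(dim=(1, 2))                      # (batch, n_key)

class LearnedThresholdPruner(torch.nn.Module):
    def __init__(self, init_threshold=0.01, temperature=0.01):
        super().__init__()
        # One learnable threshold per transformer layer (illustrative init).
        self.threshold = torch.nn.Parameter(torch.tensor(init_threshold))
        self.temperature = temperature

    def forward(self, scores, hard=False):
        if hard:
            # Inference: hard keep/drop decision per token.
            return (scores >= self.threshold).float()
        # Training: soft, differentiable mask so the threshold receives gradients.
        return torch.sigmoid((scores - self.threshold) / self.temperature)
```

The soft mask is what makes the threshold learnable end-to-end; at inference it is replaced by the hard comparison so pruned tokens can actually be dropped from the sequence.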
“…Tang et al. [28] present a top-down, layer-by-layer patch slimming algorithm to reduce the computational cost of pre-trained Vision Transformers. The core strategy of these algorithms and other similar works [11, 13, 19] is to discard redundant tokens and thereby reduce the computational complexity of the model.…”
Section: Related Work
confidence: 99%
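The computational argument behind discarding tokens is that self-attention cost grows quadratically with the number of tokens, while the feed-forward block grows linearly, so tokens dropped early compound into large savings. A rough cost model with illustrative sizes (not figures from the paper or any of the citing works):

```python
def layer_flops(n, d, ffn_mult=4):
    # Rough per-layer cost: self-attention ~ O(n^2 * d), MLP ~ O(n * d^2).
    attn = 2 * n * n * d                  # QK^T scores and attention-weighted sum of V
    ffn = 2 * n * d * (ffn_mult * d)      # two dense layers of the feed-forward block
    return attn + ffn

d = 768                                    # illustrative hidden size
full = layer_flops(n=512, d=d)
half = layer_flops(n=256, d=d)             # half the tokens kept
print(f"per-layer FLOPs with half the tokens: {half / full:.0%} of the original")
```

In this toy example the attention term shrinks quadratically while the MLP term shrinks only linearly, so keeping half the tokens leaves a bit under half of the original per-layer cost.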