“…The Transformer [32] continues to advance natural language processing [4,8,18,27,26,19], computer vision [9,2,31,21], and audio processing [12,1,13,30]. Although it outperforms architectures such as RNNs [7] and CNNs [16,14,11] on many sequence modeling tasks, it lacks length extrapolation capability, which restricts the range of sequence lengths it can handle: sequences at inference time must be no longer than those seen during training.…”
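As a minimal illustration of this constraint (not taken from the paper), consider a model that uses a learned absolute position-embedding table sized to its training length. Positions beyond the table simply have no embedding, so longer inference sequences cannot be encoded at all. The names `TRAIN_MAX_LEN`, `D_MODEL`, and `encode_positions` below are hypothetical, chosen only for this sketch:

```python
import numpy as np

TRAIN_MAX_LEN = 512   # hypothetical maximum sequence length seen in training
D_MODEL = 64          # hypothetical embedding dimension

# Learned absolute position embeddings: one row per training position.
# Positions >= TRAIN_MAX_LEN have no row and therefore cannot be represented.
pos_table = np.random.randn(TRAIN_MAX_LEN, D_MODEL)

def encode_positions(seq_len: int) -> np.ndarray:
    """Look up position embeddings for a sequence of length seq_len."""
    if seq_len > TRAIN_MAX_LEN:
        raise ValueError(
            f"sequence length {seq_len} exceeds training length {TRAIN_MAX_LEN}"
        )
    return pos_table[:seq_len]

encode_positions(512)       # fine: within the training range
try:
    encode_positions(1024)  # fails: no embedding exists for positions >= 512
except ValueError as err:
    print("cannot encode:", err)
```

Sinusoidal embeddings, by contrast, can be computed for arbitrary positions, yet attention quality still tends to degrade beyond the training length, which is why length extrapolation remains an open limitation rather than a mere lookup-table issue.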