2024
DOI: 10.1007/s40747-024-01451-x

Sla-former: conformer using shifted linear attention for audio-visual speech recognition

Yewei Xiao, Jian Huang, Xuanming Liu et al.

Abstract: Conformer-based models have proven highly effective in Audio-visual Speech Recognition, integrating auditory and visual inputs to significantly enhance speech recognition accuracy. However, the widely utilized softmax attention mechanism within conformer models encounters scalability issues, with its spatial and temporal complexity escalating quadratically with sequence length. To address these challenges, this paper introduces the Shifted Linear Attention Conformer, an evolved iteration of the conformer archi…

Cited by 0 publications
References 65 publications