2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00718

TSM: Temporal Shift Module for Efficient Video Understanding

Abstract: The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost. Conventional 2D CNNs are computationally cheap but cannot capture temporal relationships; 3D CNN based methods can achieve good performance but are computationally intensive, making it expensive to deploy. In this paper, we propose a generic and effective Temporal Shift Module (TSM) that enjoys both high efficiency and high performance. Specifically, it can achieve the p…

Cited by 1,589 publications (1,369 citation statements)
References 59 publications
“…As shown in Table 2 Other SOTA Methods. We also compare our method with other recently proposed approaches, including TSM [20], STM [14], and ABM [40]. As Table 2 shows, STH with 8-frame input already outperforms ABM with (16 × 3)-frame input.…”
Section: Comparison With Different Convolutions On Something-somethin… (mentioning)
confidence: 94%
“…TRN [38] can learn temporal reasoning relationship from the features of the last layer, but its performance is still inferior to ours. The recently-proposed TSM [20] method performs better than other 2D based methods including TRN [38] and MFNet [18] as it has stronger temporal modeling ability across all levels. Compared to TSM, our proposed STH network achieves new state-of-the-art performance with 46.8% top-1 accuracy at T = 8 and 48.3% top-1 accuracy at T = 16, with even lower computational complexity.…”
Section: Comparison With Different Convolutions On Something-somethin… (mentioning)
confidence: 97%