2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.01226
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection

Figure 1: Diagram of the transitional state. Some ambiguous states occur around, but do not belong to, the target actions, and they are hard to distinguish. We define these states as the "transitional state" (red boxes). Distinguishing them effectively improves temporal extent detection.

Abstract: Current state-of-the-art approaches for spatio-temporal action detection have achieved impressive results but remain unsatisfactory for temporal extent detection. The main reason comes from …

Cited by 60 publications (30 citation statements)
References 25 publications (41 reference statements)
"…Their effect has been confirmed in other fields [25-35, 39], such as object detection, anomaly detection, and so on. Multi-scale modules [25-29] use branches at different scales to collect complementary ice sheet radar image features at different levels and merge them into multi-scale features, remedying the poor feature extraction ability of a single-scale method. Attention modules [30-34] assign weights to different types of features from a global perspective to suppress noise, refine important ice boundary features, and fit boundaries in radar topology sequences…"
Section: Related Work
confidence: 92%
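The multi-scale and attention modules described in the quoted statement can be illustrated with a minimal numpy sketch. This is an assumption-laden toy (average pooling per branch and softmax channel weights rather than learned convolutions); all function names here are hypothetical, not from the cited papers.

```python
import numpy as np

def multi_scale_features(x, scales=(1, 2, 4)):
    """Pool a (C, H, W) feature map at several scales, upsample each branch
    back to full resolution (nearest neighbour), and concatenate along the
    channel axis -- a toy stand-in for a multi-scale module. Assumes H and W
    are divisible by every scale."""
    c, h, w = x.shape
    branches = []
    for s in scales:
        # average-pool with a stride-s window
        pooled = x.reshape(c, h // s, s, w // s, s).mean(axis=(2, 4))
        # nearest-neighbour upsample back to (h, w)
        up = pooled.repeat(s, axis=1).repeat(s, axis=2)
        branches.append(up)
    return np.concatenate(branches, axis=0)

def channel_attention(x):
    """Reweight channels by softmaxed global-average-pooled responses --
    a minimal sketch of 'assigning weights from the global perspective'."""
    pooled = x.mean(axis=(1, 2))                       # (C,)
    weights = np.exp(pooled) / np.exp(pooled).sum()    # softmax over channels
    return x * weights[:, None, None]

feat = np.random.rand(3, 8, 8)
ms = multi_scale_features(feat)    # (9, 8, 8): three 3-channel branches
att = channel_attention(ms)        # same shape, channels reweighted
print(ms.shape, att.shape)
```

Note that the scale-1 branch is an identity, so the first three channels of `ms` reproduce the input; real modules would instead apply a learned convolution per branch before fusion.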
"…Therefore, to address these problems, the following two aspects are considered in our study. To further improve the representation of features in regions adjacent to ice layers, multi-scale features [25-29] can be used to extract richer scale features of the ice layers. In addition, an attention mechanism can capture long-term relationships between ice layers and fuse the context information of radar images by attending to ice sheet radar image features at different levels [30-34]…"
Section: Motivation
confidence: 99%
"…Comparing the two-stream and 3D-convolution methods, two-stream is more accurate but less efficient than the latter. How to better combine the advantages of both is a possible research direction. b) Action detection will extend from temporal action detection to spatio-temporal action detection [39]. That is to say, detection should move from a one-dimensional temporal interval to a two-dimensional spatio-temporal box, which can detect actions more comprehensively…"
Section: Future Directions and Trends
confidence: 99%
"…They utilize the Sobel operator and element-wise subtraction to calculate the spatial and temporal gradients, respectively. Song et al. [26] propose the Discriminative Motion Cue (DMC) to reduce noise in motion vectors and capture fine motion details. They train the DMC generator to approximate optical flow using a reconstruction loss and an adversarial loss, jointly with the downstream action classification task…"
Section: B. Spatiotemporal Two-Stream
confidence: 99%
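The gradient computation mentioned in the quoted statement (Sobel for spatial gradients, element-wise subtraction for temporal gradients) can be sketched as follows. This is an illustrative reimplementation under the usual Sobel-kernel definition and zero padding, not the cited paper's actual code.

```python
import numpy as np

def sobel_xy(frame):
    """Spatial gradients of a 2D frame via 3x3 Sobel kernels, computed with
    an explicit (slow but transparent) convolution and zero padding."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)   # horizontal-gradient kernel
    ky = kx.T                                   # vertical-gradient kernel
    h, w = frame.shape
    padded = np.pad(frame.astype(float), 1)
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = padded[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    return gx, gy

def temporal_gradient(frame_t, frame_t1):
    """Temporal gradient as element-wise subtraction of consecutive frames."""
    return frame_t1.astype(float) - frame_t.astype(float)

f0 = np.random.rand(16, 16)
f1 = np.random.rand(16, 16)
gx, gy = sobel_xy(f0)          # spatial gradients of frame t
gt = temporal_gradient(f0, f1) # temporal gradient between t and t+1
print(gx.shape, gy.shape, gt.shape)
```

In practice the Sobel step would use an optimized library routine (e.g. `scipy.ndimage.sobel`); the loop here only makes the kernel arithmetic explicit.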