2018
DOI: 10.1109/tcsvt.2017.2736553

T-CNN: Tubelets With Convolutional Neural Networks for Object Detection From Videos

Abstract: The state-of-the-art performance for object detection has been significantly improved over the past two years. Besides the introduction of powerful deep neural networks such as GoogleNet [1] and VGG [2], novel object detection frameworks such as R-CNN [3] and its successors, Fast R-CNN [4] and Faster R-CNN [5], play an essential role in improving the state-of-the-art. Despite their effectiveness on still images, those frameworks are not specifically designed for object detection from videos. Temporal and conte…

Citations: cited by 450 publications (289 citation statements)
References: 47 publications
“…Several previous works devised various post-processing techniques applied to the results of still image detectors by leveraging temporal information: Kang et al [15,14] proposed to suppress false positive detections via multicontext suppression (MCS) and propagate predicted bounding boxes across frames using the motion calculated by optical flow. Then a temporal convolution neural network is trained to rescore the tubelets generated using visual tracking.…”
Section: Object Detection In Videos (mentioning)
confidence: 99%
“…Another line of work [14] focuses on utilizing optical flow to extract motion information to facilitate object detection. However, such pre-computed optical flow is neither efficient nor task related.…”
Section: Object Detection In Videos (mentioning)
confidence: 99%
“…Linking single frame detections across the temporal dimension as done by T-CNN [13] constitutes possibly the simplest form of temporal domain exploration. T-CNN essentially runs region-based detectors per frame and enforces motion-based propagation to adjacent frames.…”
Section: Video Object Detection (mentioning)
confidence: 99%
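
The motion-based propagation described in the statement above can be illustrated with a minimal sketch. It is not the T-CNN implementation: the function names `mean_flow_in_box` and `propagate_box`, the nested-list flow layout, and the choice to shift a box by the mean flow inside it are all assumptions made here for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Flow = List[List[Tuple[float, float]]]   # flow[y][x] = (dx, dy) toward the adjacent frame


def mean_flow_in_box(flow: Flow, box: Box) -> Tuple[float, float]:
    """Average the flow vectors of the pixels covered by the box (hypothetical helper)."""
    height, width = len(flow), len(flow[0])
    # Clip the box to the image so that indexing stays in range.
    x1, y1 = max(0, int(box[0])), max(0, int(box[1]))
    x2, y2 = min(width, int(box[2])), min(height, int(box[3]))
    vectors = [flow[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not vectors:
        # The box lies outside the image: assume no motion.
        return (0.0, 0.0)
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def propagate_box(box: Box, flow: Flow) -> Box:
    """Shift a per-frame detection to the adjacent frame by the mean flow inside it.

    Assumption: a rigid translation by the mean flow is a good enough
    approximation of the object's motion between two adjacent frames.
    """
    dx, dy = mean_flow_in_box(flow, box)
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


# Usage: a 4x4 flow field in which every pixel moves 1 px right and 2 px down.
uniform_flow = [[(1.0, 2.0) for _ in range(4)] for _ in range(4)]
moved = propagate_box((0.0, 0.0, 2.0, 2.0), uniform_flow)
```

In a real pipeline the flow field would come from an optical flow estimator, and the propagated boxes would be merged with the adjacent frame's own detections; both steps are outside the scope of this sketch.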
“…Therefore, the performance of object detection will affect almost all other computer vision research. A huge amount of effort has been put into its improvements [30,31,32].…”
Section: Introduction (mentioning)
confidence: 99%