2018
DOI: 10.1109/lra.2018.2792152

Re$^3$: Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects

Abstract: Robust object tracking requires knowledge and understanding of the object being tracked: its appearance, its motion, and how it changes over time. A tracker must be able to modify its underlying model and adapt to new observations. We present Re$^3$, a real-time deep object tracker capable of incorporating temporal information into its model. Rather than focusing on a limited set of objects or training a model at test time to track a specific instance, we pretrain our generic tracker on a large variety of object…

Citations: cited by 139 publications (102 citation statements)
References: 38 publications
“…Both the training and testing data involved 64 2D+t sequences; herein, the training size is 25, while the testing size is 39. The data had varying spatial (0.27-0.77 mm) and temporal resolution (11-30). The training data were annotated by CLUST as ground truth (center of blood vessels) of fiducial features throughout the acquisition sequence.…”
Section: A Liver Ultrasound Data and Attention-aware Video Generation
Citation type: mentioning
confidence: 99%
“…Here, RNN efficiently handles temporal structure in sequences, while CNN handles spatial structures within single images. However, most of the approaches utilize convolutional and recurrent layers separately, followed by a fully connected layer, which may lead to a loss of spatial information …”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Our tracking agent maintains representations of both the policy π : S → A and the state value function v : S → R. This is done by using a DNN with parameters θ. In particular, we used a deep architecture that is similar to the one proposed by Gordon et al. [10].…”
Section: Agent Architecture
Citation type: mentioning
confidence: 99%
“…The proposed trackers are built on a deep regression network for tracking [13,10] and are trained inside an on-policy Asynchronous Actor-Critic framework [32] that incorporates SL and expert demonstrations. A state-of-the-art tracking algorithm [2] is run on a large-scale tracking dataset [18] to obtain the demonstrations.…”
Section: Introduction
Citation type: mentioning
confidence: 99%