Re$^3$: Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects

Gordon, Daniel; Farhadi, Ali; Fox, Dieter

doi:10.1109/lra.2018.2792152

Cited by 139 publications

(102 citation statements)

References 38 publications

Citation types: Supporting (none listed), Mentioning (102), Contrasting (none listed)


“…Both the training and testing data involved 64 2D+t sequences; herein, the training size is 25, while the testing size is 39. The data had varying spatial (0.27-0.77 mm) and temporal resolution (11-30). The training data were annotated by CLUST as ground truth (center of blood vessels) of fiducial features throughout the acquisition sequence.…”

Section: A Liver Ultrasound Data and Attention-aware Video Generation (mentioning)

confidence: 99%

“…Here, RNN efficiently handles temporal structure in sequences, while CNN handles spatial structures within single images. However, most of the approaches utilize convolutional and recurrent layers separately, followed by a fully connected layer, which may lead to a loss of spatial information …”

Section: Introduction (mentioning)

confidence: 99%

“…However, most of the approaches utilize convolutional and recurrent layers separately, followed by a fully connected layer, which may lead to a loss of spatial information. [18][19][20] The convolutional long short-term memory (CLSTM) RNN with forget gates and peephole connections is a modified version of vanilla RNN [21] and seems to be more suitable for the stated task of tumor tracking due to its track record for encoding long-range spatial information in sequential data and in circumventing the vanishing gradient problem. [22][23][24] CLSTM has convolutional structures in both the input-to-state and state-to-state transitions and has potential to model well the spatiotemporal relationships, while the traditional LSTM does not take spatial correlation into consideration.…”

Section: Introduction (mentioning)

confidence: 99%


Attention‐aware fully convolutional neural network with convolutional long short‐term memory network for ultrasound‐based motion tracking

Huang, Lü, et al. (2019). Medical Physics


Purpose: One of the promising options for motion management in radiation therapy (RT) is the use of a LINAC‐compatible, robotic‐arm‐mounted ultrasound imaging system, owing to its high soft tissue contrast, real‐time capability, absence of ionizing radiation, and low cost. The purpose of this work is to develop a novel deep learning‐based real‐time motion tracking strategy for ultrasound image‐guided RT.

Methods: The proposed tracker combined an attention‐aware fully convolutional neural network (FCNN) with a convolutional long short‐term memory network (CLSTM) and was end‐to‐end trainable. A glimpse sensor module was built inside the attention‐aware FCNN to discard the majority of the background by focusing on a region containing the object of interest. The FCNN extracted discriminating spatial features of the glimpse to facilitate temporal modeling by the CLSTM. The saliency mask computed from the CLSTM refined the features particular to the tracked landmarks. Moreover, a multitask loss strategy, including a bounding box loss, a localization loss, a saliency loss, and an adaptive loss weighting term, was used to facilitate training convergence and avoid over/underfitting. The tracker was tested on the databases provided by the MICCAI 2015 challenges, and the ground truth data were obtained with the help of brute force‐based template matching.

Results: A mean tracking error of 0.97 ± 0.52 mm and a maximum tracking error of 1.94 mm were observed for 85 point landmarks across 39 ultrasound cases, compared with the ground truth annotations. The tracking speed per landmark with the GPU implementation ranged from 66 to 101 frames per second, which largely exceeded the ultrasound imaging rate.

Conclusion: The results demonstrated the robustness and accuracy of the proposed deep learning‐based motion estimation, despite known shortcomings of ultrasound imaging such as speckle noise. The tracking speed of the system was sufficiently fast for real‐time applications in the RT environment. The approach provides a valuable tool to guide RT treatment with beam gating or multileaf collimator (MLC) tracking in real time.


“…Our tracking agent maintains representations of both the policy π : S → A and the state value function v : S → R. This is done by using a DNN with parameters θ. In particular, we used a deep architecture that is similar to the one proposed by Gordon et al [10].…”

Section: Agent Architecture (mentioning)

confidence: 99%

“…The proposed trackers are built on a deep regression network for tracking [13,10] and are trained inside an on-policy Asynchronous Actor-Critic framework [32] that incorporates SL and expert demonstrations. A state-of-the-art tracking algorithm [2] is run on a large-scale tracking dataset [18] to obtain the demonstrations.…”

Section: Introduction (mentioning)

confidence: 99%

Visual Tracking by Means of Deep Reinforcement Learning and an Expert Demonstrator

Dunnhofer, Martinel, Foresti, et al. (2019). 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)


In the last decade, many different algorithms have been proposed to track a generic object in videos. Their execution on recent large-scale video datasets can produce a great variety of tracking behaviours. New trends in Reinforcement Learning have shown that demonstrations from an expert agent can be used efficiently to speed up the process of policy learning. Taking inspiration from such works and from recent applications of Reinforcement Learning to visual tracking, we propose two novel trackers: A3CT, which exploits demonstrations of a state-of-the-art tracker to learn an effective tracking policy, and A3CTD, which takes advantage of the same expert tracker to correct its behaviour during tracking. Through an extensive experimental validation on the GOT-10k, OTB-100, LaSOT, UAV123 and VOT benchmarks, we show that the proposed trackers achieve state-of-the-art performance while running in real time. (arXiv:1909.08487v1 [cs.CV], 18 Sep 2019)


Real‐time generic target tracking for structural displacement monitoring under environmental uncertainties via deep learning

Jeong (2021). Structural Contr & Hlth


Summary: While structural displacement provides essential information about the static and/or low‐frequency dynamic characteristics of structural behavior, full‐scale measurement of absolute displacement in field structures is extremely challenging because a fixed reference is required in most cases. Recent computer vision‐based sensing technologies have advanced to the level of reference‐free monitoring of full‐scale dynamic displacement using generic features of the structure. However, current generic feature‐based methods have been limited to short‐term or campaign‐type monitoring applications because of the intrinsic limitations of computer‐vision sensing under variable environmental conditions. This study investigates deep learning‐based approaches for real‐time computer‐vision sensing that enable displacement monitoring using generic features under harsh environmental uncertainties. The Distractor‐Aware Siamese Region Proposal Network (DaSiamRPN) was employed to address the environmental uncertainty issues, particularly those caused by changes in lighting conditions and obstructed vision, without sacrificing real‐time processing capability. A series of indoor and outdoor experiments was conducted to evaluate performance under lighting changes, occlusion, and haze. Comparative tests showed that the proposed method outperformed various other vision‐based object tracking methods, demonstrating its feasibility for long‐term displacement monitoring of full‐scale structures.
