2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2019.00282

Visual Tracking by Means of Deep Reinforcement Learning and an Expert Demonstrator

Abstract: In the last decade, many different algorithms have been proposed to track a generic object in videos. Their execution on recent large-scale video datasets can produce a great variety of tracking behaviours. New trends in Reinforcement Learning have shown that demonstrations of an expert agent can be used efficiently to speed up the process of policy learning. Taking inspiration from such works and from the recent applications of Reinforcement Learning to visual tracking, we propose two novel trackers, A3CT, …

Cited by 37 publications (18 citation statements); References 45 publications.
“…Tracker. We follow the most recent advancements in deep regression tracking [7], [34], [9] to implement our tracker s(·|θ) as a deep neural network with weights θ. The network gets as input the state s_t as two image patches which pass through two ResNet-18 [35] CNN branches with shared weights.…”
Section: Methods (mentioning)
confidence: 99%
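The quoted method description amounts to a siamese regression network: two shared-weight ResNet-18 branches encode the two image patches of the state s_t, and a regression head predicts the target box. Below is a minimal PyTorch sketch of that layout; the class name, the 512-d feature size read off the ResNet-18 backbone, and the concatenation-plus-MLP head are illustrative assumptions, since the quote only specifies the two shared-weight branches.

```python
# Minimal sketch of a siamese regression tracker with two shared-weight
# ResNet-18 branches, as in the quoted method section. The regression head
# (concatenation + MLP to 4 bounding-box values) is an assumption.
import torch
import torch.nn as nn
import torchvision.models as models

class SiameseRegressionTracker(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # Drop the classification layer, keep the 512-d pooled feature.
        self.branch = nn.Sequential(*list(backbone.children())[:-1])
        self.head = nn.Sequential(
            nn.Linear(512 * 2, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 4),  # bounding-box prediction
        )

    def forward(self, patch_prev, patch_curr):
        # Both patches go through the same branch (shared weights).
        f_prev = self.branch(patch_prev).flatten(1)
        f_curr = self.branch(patch_curr).flatten(1)
        return self.head(torch.cat([f_prev, f_curr], dim=1))

# Usage: two crops forming the state s_t.
net = SiameseRegressionTracker()
out = net(torch.randn(1, 3, 128, 128), torch.randn(1, 3, 128, 128))
print(out.shape)  # torch.Size([1, 4])
```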
“…An improvement to [45] for SOT was proposed by [84], where a visual tracker was formulated using DRL and an expert demonstrator. The authors treated the problem as an MDP, where the state consists of two consecutive frames that have been cropped using the bounding box corresponding to the former frame and used a scaling factor to control the offset while cropping.…”
Section: DRL in Object Tracking (mentioning)
confidence: 99%
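As a rough illustration of the state construction described in this passage, the sketch below crops both consecutive frames with the bounding box from the former frame, enlarged by a scaling factor that controls the offset around the target; the function name, the (x, y, w, h) box format, and the default scale value are assumptions, not details taken from [84].

```python
# Sketch of the MDP state construction described above: both consecutive
# frames are cropped with the bounding box of the former frame, enlarged by
# a scaling factor. Names and the default scale are illustrative assumptions.
import numpy as np

def crop_state(frame_prev, frame_curr, bbox_prev, scale=1.5):
    """bbox_prev = (x, y, w, h) in pixels; returns the two crops forming the state."""
    x, y, w, h = bbox_prev
    cx, cy = x + w / 2.0, y + h / 2.0
    w_s, h_s = w * scale, h * scale
    x0, y0 = int(max(cx - w_s / 2, 0)), int(max(cy - h_s / 2, 0))
    x1 = int(min(cx + w_s / 2, frame_prev.shape[1]))
    y1 = int(min(cy + h_s / 2, frame_prev.shape[0]))
    crop_prev = frame_prev[y0:y1, x0:x1]
    crop_curr = frame_curr[y0:y1, x0:x1]  # same window on the next frame
    return crop_prev, crop_curr

# Usage with dummy frames:
f_prev = np.zeros((480, 640, 3), dtype=np.uint8)
f_curr = np.zeros((480, 640, 3), dtype=np.uint8)
state = crop_state(f_prev, f_curr, bbox_prev=(300, 200, 60, 80))
```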
“…The figure illustrates a general implementation of object tracking in videos using DRL, where the state consists of two consecutive frames (F_t, F_{t+1}) with a bounding box for the first frame, produced by another algorithm in the first iteration or by previous iterations of the DRL agent. The actions correspond to moving the bounding box on the image to fit the object in frame F_{t+1}, hence forming a new state with frame F_{t+1} and frame F_{t+2} along with the bounding box for frame F_{t+1} predicted by the previous iteration; the reward corresponds to whether the IoU is greater than a given threshold, as used by [118], [308], [45], [84], [307], [168], [169].…”
Section: Object (mentioning)
confidence: 99%
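The passage describes the reward these DRL trackers share: a binary signal based on whether the IoU between the predicted and ground-truth boxes exceeds a threshold. A small sketch of that reward follows; the (x, y, w, h) box format, the ±1 reward values, and the default threshold of 0.5 are assumptions, since the cited works use their own settings.

```python
# Sketch of the IoU-thresholded reward described in the passage above.
# Boxes are (x, y, w, h); the threshold and the +/-1 values are assumptions.
def iou(box_a, box_b):
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix0, iy0 = max(ax, bx), max(ay, by)
    ix1, iy1 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def reward(pred_box, gt_box, threshold=0.5):
    # Positive reward if the predicted box overlaps the ground truth enough.
    return 1.0 if iou(pred_box, gt_box) >= threshold else -1.0

print(reward((10, 10, 50, 50), (12, 12, 50, 50)))  # 1.0
```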
“…Athlete Tracking. The first step of the pipeline is to exploit a visual object tracker [9,10] to track the motion of the athlete across all the frames up to F_t. We used a tracker outputting a bounding-box b_t = (x, …). The number in the top-left corner of each image reports the frame index t in the video.…”
Section: Pipeline (mentioning)
confidence: 99%
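As a loose sketch of this first pipeline step, the loop below runs a visual object tracker frame by frame and records one bounding box b_t per frame index t; the OpenCV-style init/update interface and the CSRT stand-in tracker are assumptions, not the tracker [9,10] used in the quoted pipeline.

```python
# Loose sketch of the athlete-tracking step: run a visual object tracker
# frame by frame and keep one bounding box b_t per frame. The CSRT tracker
# from opencv-contrib is only a stand-in for the tracker used in the paper.
import cv2

def track_athlete(video_path, init_bbox):
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    tracker = cv2.TrackerCSRT_create()  # stand-in tracker (opencv-contrib)
    tracker.init(frame, init_bbox)      # init_bbox = (x, y, w, h)
    boxes = [init_bbox]
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, bbox = tracker.update(frame)
        boxes.append(tuple(map(int, bbox)) if found else boxes[-1])
    cap.release()
    return boxes  # boxes[t] is b_t for frame index t
```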