Learning Discriminative Model Prediction for Tracking

Bhat, Goutam; Danelljan, Martin; Van Gool, Luc; Timofte, Radu

doi:10.48550/arxiv.1904.07220

Cited by 5 publications

(18 citation statements)

References 0 publications

“…Generative trackers [45,3] base on the matching results of the features following a non-parametric nearest-neighbor methodology, while discriminative trackers with either tracking-by-detection framework [49,36] or correlation filter [18,10] resort to an online updated parametric classifier. A related study [58] shows that generative trackers prevail given its generative embedding space crucial for high-fidelity representation [26,51], whilst discriminative trackers [6,9,4] exploit the background information in context to learn a discriminant model thus perform well at suppressing the distractors. Cascaded Framework for Tracking.…”

Section: Related Work (mentioning)

confidence: 99%

“…Though with simple classification rules, these non-parametric models exclude a mechanism for feature selection thus are not robust to noisy features. On the contrary, optimization-based model [4,38,47,19] uses an explicit gradient-descent algorithm to adjust the parameters of the model given an online sampled dataset. Model-based model [53,27,12] learns a parameterized predictor to estimate model parameters by implicitly leveraging gradient or latent distribution as meta information.…”

Section: Related Work (mentioning)

confidence: 99%

“…The majority of these methods utilize the gradient under certain objectives for the online model update. Some approaches [53,27,4] are optimized under L2 or hinge loss thus form a ridge regression problem where the decision boundary is linear. However, ridge regression is not robust to outliers and unable to pick informative or decisive hard examples.…”

Section: Related Work (mentioning)

confidence: 99%

“…The concept of FSL, however, is not new and has been introduced by several previous works [38,19,47,53,4]. By involving the online updating into the offline training stage as an inner loop, these methods turn the manual design of the online update strategy [18,36] into a data-driven module.…”

Section: Introduction (mentioning)

confidence: 99%

Real-Time Visual Object Tracking via Few-Shot Learning

Zhou, Li, Wang et al. 2021

Preprint

Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL). While the concept of FSL is not new in tracking and has been previously applied by prior works, most of them are tailored to fit specific types of FSL algorithms and may sacrifice running speed. In this work, we propose a generalized two-stage framework that is capable of employing a large variety of FSL algorithms while presenting faster adaptation speed. The first stage uses a Siamese Regional Proposal Network to efficiently propose the potential candidates and the second stage reformulates the task of classifying these candidates to a few-shot classification problem. Following such a coarse-to-fine pipeline, the first stage proposes informative sparse samples for the second stage, where a large variety of FSL algorithms can be conducted more conveniently and efficiently. As substantiation of the second stage, we systematically investigate several forms of optimization-based few-shot learners from previous works with different objective functions, optimization methods, or solution space. Beyond that, our framework also entails a direct application of the majority of other FSL algorithms to visual tracking, enabling mutual communication between researchers on these two topics. Extensive experiments on the major benchmarks, VOT2018, OTB2015, NFS, UAV123, TrackingNet, and GOT-10k are conducted, demonstrating a desirable performance gain and a real-time speed.

“…These approaches learn an online model of the object's appearance using hand-crafted features or deep features pre-trained for object classification. Given the recent prevalence of meta-learning framework, (Bhat et al 2019;Park and Berg 2018) further learns to learn during tracking. Comparatively speaking, online learning for siamese-network-based trackers has had less attention.…”

Section: Related Work (mentioning)

confidence: 99%

Discriminative and Robust Online Learning for Siamese Visual Tracking

Zhou, Wang, Sun 2020

AAAI

The problem of visual object tracking has traditionally been handled by variant tracking paradigms, either learning a model of the object's appearance exclusively online or matching the object with the target in an offline-trained embedding space. Despite the recent success, each method agonizes over its intrinsic constraint. The online-only approaches suffer from a lack of generalization of the model they learn thus are inferior in target regression, while the offline-only approaches (e.g., convolutional siamese trackers) lack the target-specific context information thus are not discriminative enough to handle distractors, and robust enough to deformation. Therefore, we propose an online module with an attention mechanism for offline siamese networks to extract target-specific features under L2 error. We further propose a filter update strategy adaptive to treacherous background noises for discriminative learning, and a template update strategy to handle large target deformations for robust learning. Effectiveness can be validated in the consistent improvement over three siamese baselines: SiamFC, SiamRPN++, and SiamMask. Beyond that, our model based on SiamRPN++ obtains the best results over six popular tracking benchmarks and can operate beyond real-time.

SiamUT: Siamese Unsymmetrical Transformer-like Tracking

et al. 2023

Siamese networks have proven to be suitable for many computer vision tasks, including single object tracking. These trackers leverage the siamese structure to benefit from feature cross-correlation, which measures the similarity between a target template and the corresponding search region. However, the linear nature of the correlation operation leads to the loss of important semantic information and may result in suboptimal performance when faced with complex background interference or significant object deformations. In this paper, we introduce the Transformer structure, which has been successful in vision tasks, to enhance the siamese network’s performance in challenging conditions. By incorporating self-attention and cross-attention mechanisms, we modify the original Transformer into an asymmetrical version that can focus on different regions of the feature map. This transformer-like fusion network enables more efficient and effective fusion procedures. Additionally, we introduce a two-layer output structure with decoupling prediction heads, improved loss functions, and window penalty post-processing. This design enhances the performance of both the classification and the regression branches. Extensive experiments conducted on large public datasets such as LaSOT, GOT-10k, and TrackingNet demonstrate that our proposed SiamUT tracker achieves state-of-the-art precision performance on most benchmark datasets.
