Sparse representation is a viable approach to visual tracking. In this paper, we propose a structured multi-task multi-view tracking (SMTMVT) method, which exploits a sparse appearance model within the particle filter framework to track targets under different challenges. Specifically, we extract features of the target candidates from different views and sparsely represent them by a linear combination of templates of the corresponding views. Unlike conventional sparse trackers, SMTMVT not only jointly considers the relationships between different tasks and different views but also retains the structures among different views in a robust multi-task multi-view formulation. We introduce a numerical algorithm based on the proximal gradient method that quickly and effectively computes the sparse representation by dividing the optimization problem into two subproblems with closed-form solutions. Both qualitative and quantitative evaluations on a benchmark of challenging image sequences demonstrate the superior performance of the proposed tracker against various state-of-the-art trackers.
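The core computation in sparse trackers of this kind is an l1-regularized least-squares problem solved by proximal-gradient iterations. The abstract's exact two-subproblem splitting is not given here; as a minimal sketch of the general technique, the following shows ISTA, a standard proximal-gradient method whose l1 proximal step (soft-thresholding) has a closed-form solution. The dictionary `D` plays the role of the template set and `y` a candidate's feature vector; both names are illustrative, not from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    # Closed-form proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam=0.01, n_iter=1000):
    """Proximal-gradient (ISTA) sketch for sparse coding:
    minimize 0.5*||y - D x||^2 + lam*||x||_1 over x.
    Each iteration alternates a gradient step on the smooth term
    with the closed-form soft-thresholding step."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)          # gradient of the least-squares term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```

In a tracker, one such problem is solved per candidate (per task), and structured regularizers couple the coefficient vectors across tasks and views; the per-iteration cost stays low because both steps are matrix-vector operations.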
Sparse representation has recently been successfully applied in visual tracking. It utilizes a set of templates to represent target candidates and selects the one with the minimum reconstruction error as the tracking result. In this paper, we propose a robust deep features-based structured group local sparse tracker (DF-SGLST), which exploits the deep features of local patches inside target candidates and represents them by a set of templates in the particle filter framework. Unlike conventional local sparse trackers, the proposed optimization model in DF-SGLST employs a group-sparsity regularization term to seamlessly incorporate local and spatial information of the target candidates and preserve the spatial layout structure among them. To solve the optimization model, we propose an efficient and fast numerical algorithm that consists of two subproblems with closed-form solutions. Evaluations in terms of success and precision on benchmarks of challenging image sequences (e.g., OTB50 and OTB100) demonstrate the superior performance of the proposed tracker against several state-of-the-art trackers.

Fig. 3: The OPE success plots and precision plots for each of the 11 challenge subsets in OTB50.
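The candidate-selection rule described above, picking the candidate with the minimum reconstruction error against the template set, can be sketched as follows. This is a simplified illustration using plain least-squares coefficients in place of the tracker's regularized sparse codes; `T` (templates as columns) and `best_candidate` are illustrative names, not the paper's notation.

```python
import numpy as np

def best_candidate(T, candidates):
    """Return the index of the candidate best explained by the
    template set T, i.e., the one with the smallest reconstruction
    error.  Least squares stands in for the sparse coding step."""
    errors = []
    for y in candidates:
        c, *_ = np.linalg.lstsq(T, y, rcond=None)  # coefficients over templates
        errors.append(np.linalg.norm(y - T @ c))   # reconstruction error
    return int(np.argmin(errors))
```

In the particle filter framework, the candidates are the particles sampled at the current frame, and the selected candidate becomes the tracking result, optionally followed by a template update.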