Object tracking algorithms based on Siamese networks typically extract the deep features of the target from the first frame of the video sequence as a template and use that template for the entire tracking process. Because the manually annotated target in the first frame is accurate, these algorithms tend to perform stably. However, a template extracted only from the first frame adapts poorly to changes in the target's features. Inspired by transformer-based feature fusion networks, this paper proposes a template update module, the multi‐template temporal information fusion module (MTFM), which can be trained offline. By fusing multiple target template features over the time series, the template can continually adapt to changes in target appearance during tracking. To train the MTFM, this paper proposes a training method that uses time series data with mean square error (MSE) as the loss function. Applied to the SiamFC++ tracker, the MTFM obtains good experimental results on three challenging datasets: VOT2016, OTB100 and GOT‐10k. The algorithm runs at about 200 fps on a graphics processing unit (GPU), which is good real‐time performance.
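The idea of fusing several templates and training the fusion with an MSE loss can be illustrated with a minimal sketch. It is not the MTFM described in the abstract. It assumes flat feature vectors, replaces the transformer-based fusion with a single scaled dot-product attention step, and assumes the MSE target is the feature of the annotated target in a later frame. The function names `fuse_templates` and `mse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_templates(templates, query):
    """Fuse template vectors with attention weights (assumed stand-in for MTFM).

    Each template is weighted by its scaled dot product with `query`,
    here taken to be the most recent template feature.
    """
    dim = len(query)
    scores = [
        sum(t_i * q_i for t_i, q_i in zip(t, query)) / math.sqrt(dim)
        for t in templates
    ]
    weights = softmax(scores)
    fused = [sum(w * t[i] for w, t in zip(weights, templates)) for i in range(dim)]
    return fused, weights


def mse(pred, target):
    """Mean square error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Toy data: three template features collected over time (made-up values).
templates = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.6, 0.4, 0.0]]
fused, weights = fuse_templates(templates, query=templates[-1])
# Assumed training target: feature of the annotated target in a later frame.
loss = mse(fused, [0.5, 0.5, 0.0])
```

In an offline training setup, a loss of this kind would be minimized over the parameters of the fusion module. This sketch has no learnable parameters and only shows the data flow.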
SiamFC++ extracts the object features of only the first frame as its tracking template and uses only the highest-level feature maps in both the classification and regression branches, so the respective characteristics of the two branches are not fully utilized. To address this, this paper proposes an object tracking algorithm based on SiamFC++ that uses the multi-layer features of the Siamese network to update the template. First, a feature pyramid network (FPN) is used to extract feature maps from different layers of the backbone for the classification and regression branches. Second, 3D convolution is used to update the tracking template. Next, a template update condition based on mutual information is proposed. Finally, AlexNet is used as the backbone and GOT-10k as the training set. Compared with SiamFC++, the proposed algorithm obtains improved results on the OTB100, VOT2016, VOT2018 and GOT-10k datasets, and tracking runs in real time.
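A mutual-information update condition can be sketched as follows. The abstract does not state how the mutual information is computed or which direction the condition takes. This example assumes a histogram estimate over quantized grayscale pixel values and a rule that updates only when the candidate patch still shares enough information with the template. The names `mutual_information` and `should_update`, and the threshold value, are hypothetical.

```python
import math
from collections import Counter


def mutual_information(patch_a, patch_b, bins=8, max_value=256):
    """Histogram estimate of mutual information (in nats) between two patches.

    Patches are flat, equal-length sequences of grayscale values
    in [0, max_value).
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal length")
    width = max_value / bins
    qa = [min(int(v / width), bins - 1) for v in patch_a]
    qb = [min(int(v / width), bins - 1) for v in patch_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))   # joint histogram of quantized pairs
    pa, pb = Counter(qa), Counter(qb)  # marginal histograms
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((pa[a] / n) * (pb[b] / n)))
    return mi


def should_update(template_patch, candidate_patch, threshold=0.5):
    """Assumed rule: update only if the candidate still resembles the template."""
    return mutual_information(template_patch, candidate_patch) >= threshold


template = [0, 0, 255, 255]
same_structure = [0, 0, 255, 255]   # same appearance as the template
flat_patch = [10, 10, 10, 10]       # e.g. an occluder with no structure
update_on_match = should_update(template, same_structure)
update_on_flat = should_update(template, flat_patch)
```

A rule of this kind is meant to avoid polluting the template with occluded or drifted patches. A real tracker would have to tune the threshold and bin count on validation data.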